1. Introduction

A 1932 theorem of Mazur and Ulam [123] asserts that if $X$ and $Y$
are Banach spaces and $f : X \to Y$ is an onto isometry then $f$ must
be an affine mapping. The assumption that $f(X) = Y$ is needed here,
as exhibited by, say, the mapping $t \mapsto (t, \sin t)$ from $\mathbb{R}$ to $(\mathbb{R}^2, \|\cdot\|_\infty)$.
However, a major strengthening of the Mazur-Ulam theorem due to
Figiel [nO] asserts that if $f : X \to Y$ is an isometry and $f(0) = 0$
then there is a unique linear operator $T : \mathrm{span}(f(X)) \to X$ such that
$\|T\| = 1$ and $T(f(x)) = x$ for every $x \in X$. Thus, when viewed as
metric spaces in the isometric category, Banach spaces are highly rigid:
their linear structure is completely preserved under isometries, and, in
fact, isometries between Banach spaces are themselves rigid.
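
The example $t \mapsto (t, \sin t)$ can be checked numerically. The following is a minimal sketch, not part of the original argument: the sample grid and the helper names `sup_dist` and `f` are assumptions made for illustration. It confirms on a grid that the map preserves distances into $(\mathbb{R}^2, \|\cdot\|_\infty)$ (because $|\sin s - \sin t| \le |s - t|$), and that it fails the midpoint test that every affine map passes.

```python
import math

def sup_dist(u, v):
    """Distance between two points of (R^2, ||.||_inf)."""
    return max(abs(u[0] - v[0]), abs(u[1] - v[1]))

def f(t):
    """The map t -> (t, sin t) from R into (R^2, ||.||_inf)."""
    return (t, math.sin(t))

# Sample grid of real numbers (an arbitrary choice for illustration).
points = [k * 0.37 - 10.0 for k in range(60)]

# Isometry check: ||f(s) - f(t)||_inf should equal |s - t| for all sampled pairs.
max_defect = max(abs(sup_dist(f(s), f(t)) - abs(s - t))
                 for s in points for t in points)

# Affinity check: an affine map sends midpoints to midpoints.
s, t = 0.0, math.pi
mid_image = f((s + t) / 2)                                   # f(pi/2)
image_mid = tuple((a + b) / 2 for a, b in zip(f(s), f(t)))   # midpoint of f(0), f(pi)
midpoint_gap = sup_dist(mid_image, image_mid)

print(max_defect, midpoint_gap)
```

A zero defect together with a midpoint gap close to 1 is what one expects from an isometry that is not affine, which is possible here only because the map is not onto.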

At the opposite extreme to isometries, the richness of Banach spaces
collapses if one removes all quantitative considerations by treating them
as topological spaces. Specifically, answering a question posed in 1928
by Fréchet [62] and again in 1932 by Banach [18], Kadec [92,^ proved
that any two separable infinite dimensional Banach spaces are
homeomorphic. See [9ll EHl [271 H] for more information on this topic, as well
as its treatment in the monographs [291 [55]. An extension of the Kadec
theorem to non-separable spaces was obtained by Toruńczyk in [183].

If one only considers homeomorphisms between Banach spaces that 
are "quantitatively continuous" rather than just continuous, then one 
recovers a rich and subtle category that exhibits deep rigidity results 
but does not coincide with the linear theory of Banach spaces. We will 
explain how this suggests that, despite having no a priori link to Banach 
spaces, general metric spaces have a hidden structure. Using this point 
of view, insights from Banach space theory can be harnessed to solve 
problems in seemingly unrelated disciplines, including group theory, 
algorithms, data structures, Riemannian geometry, harmonic analysis 
and probability theory. The purpose of this article is to describe a 
research program that aims to expose this hidden structure of metric 
spaces, while highlighting some achievements that were obtained over 
the past five decades as well as challenging problems that remain open. 

In order to make the previous paragraph precise one needs to define
the concept of a quantitatively continuous homeomorphism. While
there are several meaningful and nonequivalent ways to do this, we
focus here on uniform homeomorphisms. Given two metric spaces
$(\mathcal{M}, d_{\mathcal{M}})$ and $(\mathcal{N}, d_{\mathcal{N}})$, a bijection $f : \mathcal{M} \to \mathcal{N}$ is called a uniform
homeomorphism if both $f$ and $f^{-1}$ are uniformly continuous, or
equivalently if there exist nondecreasing functions $\alpha, \beta : [0, \infty) \to (0, \infty]$ with
$\lim_{t \to 0} \beta(t) = 0$ such that $\alpha(d_{\mathcal{M}}(a, b)) \le d_{\mathcal{N}}(f(a), f(b)) \le \beta(d_{\mathcal{M}}(a, b))$
for all distinct $a, b \in \mathcal{M}$.

In the seminal 1964 paper [107] Lindenstrauss proved that, in contrast
to the Kadec theorem, there exist many pairs of separable infinite
dimensional Banach spaces, including $L_p(\mu)$ and $L_q(\nu)$ if $p \neq q$ and
$\max\{p, q\} \ge 2$, that are not uniformly homeomorphic. Henkin proved
in [7H] that if $n \ge 2$ then $C^k([0, 1]^n)$ is not uniformly homeomorphic
to $C^1([0, 1])$ for all $k \in \mathbb{N}$ (this result was previously announced by
Grothendieck [7T] with some indication of a proof). Important work of
Enflo |1HI HHl [50], which was partly motivated by his profound
investigation of Hilbert's fifth problem in infinite dimensions, obtained
additional results along these lines. In particular, in [19] Enflo completed
Lindenstrauss' work [107] by proving that $L_p(\mu)$ and $L_q(\nu)$ are not
uniformly homeomorphic if $p \neq q$ and $p, q \in [1, 2]$, and in [50] he proved
that a Banach space $(X, \|\cdot\|_X)$ which is uniformly homeomorphic to a
Hilbert space $(H, \|\cdot\|_H)$ must be isomorphic to $H$, i.e., there exists a
bounded linear operator $T : X \to H$ such that $\|Tx\|_H \ge \|x\|_X$ for all
$x \in X$. A later deep theorem of Johnson, Lindenstrauss and Schechtman
[89] makes the same assertion with Hilbert space replaced by $\ell_p$,
$p \in (1, \infty)$, i.e., any Banach space that is uniformly homeomorphic to
$\ell_p$ must be isomorphic to $\ell_p$. At the same time, as shown by Aharoni
and Lindenstrauss [T] and Ribe [17T], there exist pairs of uniformly
homeomorphic Banach spaces that are not isomorphic.

In 1976 Martin Ribe proved [169] that if two Banach spaces are
uniformly homeomorphic then they have the same finite dimensional
subspaces. To make this statement precise, recall James' |H5] notion
of (crude) finite representability: a Banach space $(X, \|\cdot\|_X)$ is said
to be finitely representable in a Banach space $(Y, \|\cdot\|_Y)$ if there
exists $K \in [1, \infty)$ such that for every finite dimensional linear
subspace $F \subseteq X$ there exists a linear operator $T : F \to Y$ satisfying
$\|x\|_X \le \|Tx\|_Y \le K\|x\|_X$ for all $x \in F$. For example, for all $p \in [1, \infty]$
any $L_p(\mu)$ space is finitely representable in $\ell_p$, and the classical
Dvoretzky theorem [17] asserts that Hilbert space is finitely representable in
any infinite dimensional Banach space. If $p, q \in [1, \infty]$ and $p \neq q$ then
at least one of the spaces $L_p(\mu)$, $L_q(\nu)$ is not finitely representable in
the other; see, e.g., [190].

Theorem 1.1 (Ribe's rigidity theorem [170]). If X and Y are uni- 
formly homeomorphic Banach spaces then X is finitely representable 
in Y and Y is finitely representable in X. 

Influential alternative proofs of Ribe's theorem were obtained by 
Heinrich and Mankiewicz [77] and Bourgain [33]. See also the treatment
in the surveys [52l [25] and Chapter 10 of the book [26]. In [170] Ribe
obtained a stronger version of Theorem 1.1 under additional geometric
assumptions on the spaces $X$ and $Y$. The converse to Ribe's theorem
fails, since for $p \in [1, \infty) \setminus \{2\}$ the spaces $L_p(\mathbb{R})$ and $\ell_p$ are finitely
representable in each other but not uniformly homeomorphic; for $p = 1$
this was proved by Enflo [25], for $p \in (1, 2)$ this was proved by
Bourgain [33], and for $p \in (2, \infty)$ this was proved by Gorelik [S3].

Theorem 1.1 (informally) says that isomorphic finite dimensional
linear properties of Banach spaces are preserved under uniform
homeomorphisms, and are thus in essence "metric properties". For
concreteness, suppose that $X$ satisfies the following property: for every $n \in \mathbb{N}$
and every $x_1, \ldots, x_n \in X$ the average of $\|\pm x_1 \pm x_2 \pm \ldots \pm x_n\|_X^2$ over
all the $2^n$ possible choices of signs is at most $K(\|x_1\|_X^2 + \ldots + \|x_n\|_X^2)$,
where $K \in (0, \infty)$ may depend on the geometry of $X$ but not on $n$
and $x_1, \ldots, x_n$. Ribe's theorem asserts that if $Y$ is uniformly
homeomorphic to $X$ then it also has the same property. Rather than giving
a formal definition, the reader should keep properties of this type in 
mind: they are "finite dimensional linear properties" since they are 
given by inequalities between lengths of linear combinations of finitely 
many vectors, and they are "isomorphic" in the sense that they are 
insensitive to a loss of a constant factor. Ribe's theorem is thus a re- 
markable rigidity statement, asserting that uniform homeomorphisms 
between Banach spaces cannot alter their finite dimensional structure. 

Ribe's theorem indicates that in principle any isomorphic finite di- 
mensional linear property of Banach spaces can be equivalently for- 
mulated using only distances between points and making no reference 
whatsoever to the linear structure. Recent work of Ostrovskii [156,
157, 159] can be viewed as making this statement formal in a certain
abstract sense. The Ribe program, as formulated by Bourgain in 1985 
(see [31] and mainly [32]), aims to explicitly study this phenomenon. 
If parts of the finite dimensional linear theory of Banach spaces are in 
fact a "nonlinear theory in disguise" then if one could understand how 
to formulate them using only the metric structure this would make it 
possible to study them in the context of general metric spaces. As a 
first step in the Ribe program one would want to discover metric refor- 
mulations of key concepts of Banach space theory. Bourgain's famous 
metric characterization of when a Banach space admits an equivalent 
uniformly convex norm [32] was the first successful completion of a
step in this plan. By doing so, Bourgain kick-started the Ribe pro- 
gram, and this was quickly followed by efforts of several researchers 
leading to satisfactory progress on key steps of the Ribe program. 

The Ribe program does not limit itself to reformulating aspects of
Banach space theory using only metric terms. Indeed, this should 
be viewed as only a first (usually highly nontrivial) step. Once this 
is achieved, one has an explicit "dictionary" that translates concepts 
that a priori made sense only in the presence of linear structure to the 
language of general metric spaces. The next important step in the Ribe 
program is to investigate the extent to which Banach space phenomena, 
after translation using the new "dictionary" , can be proved for general 
metric spaces. Remarkably, over the past decades it turned out that 
this approach is very successful, and it uncovers structural properties 
of metric spaces that have major impact on areas which do not have 
any a priori link to Banach space theory. Examples of such successes 
of the Ribe program will be described throughout this article. 

A further step in the Ribe program is to investigate the role of the 
metric reformulations of Banach space concepts, as provided by the 
first step of the Ribe program, in metric space geometry. This step 
is not limited to metric analogues of Banach space phenomena, but 
rather it aims to use the new "dictionary" to solve problems that are 
inherently nonlinear (examples include the use of nonlinear type in 
group theory; see Section 9.4). Moreover, given the realization that in- 
sights from Banach space theory often have metric analogues, the Ribe 
program aims to uncover metric phenomena that mirror Banach space 
phenomena but are not strictly speaking based on metric reformula- 
tions of isomorphic finite dimensional linear properties. For example, 
Bourgain's embedding theorem was discovered due to the investigation 
of a question raised by Johnson and Lindenstrauss [SS] on a metric 
analogue of John's theorem |H^. Another example is the investigation, 
as initiated by Bourgain, Figiel and Milman [34], of nonlinear versions 
of Dvoretzky's theorem [17] (in this context Milman also asked for a 
nonlinear version of his Quotient of Subspace Theorem [140], a
question that is studied in [125]). Both of the examples above led to the
discovery of theorems on metric spaces that are truly nonlinear and do 
not have immediate counterparts in Banach space theory (e.g., the ap- 
pearance of ultrametrics in the context of nonlinear Dvoretzky theory; 
see Section [s]), and they had major impact on areas such as
approximation algorithms and data structures. Yet another example is Ball's
nonlinear version [15] of Maurey's extension theorem [120], based on
nonlinear type and cotype (see Section 4). Such developments include
some of the most challenging and influential aspects of the Ribe
program. In essence, Ribe's theorem pointed the way to a certain analogy
between linear and nonlinear metric spaces. One of the main features
of the Ribe program is that this analogy is a source of new meaningful
questions in metric geometry that probably would not have been raised
if it weren't for the Ribe program. 

Remark 1.1. A rigidity theorem asserts that a deformation of a cer- 
tain object preserves more structure than one might initially expect. 
In other words, equivalence in a weak category implies the existence of 
an equivalence in a stronger category. Rigidity theorems are naturally 
important since they say much about the structure of the stronger cat- 
egory (i.e., that it is rigid). However, the point of view of the Ribe 
program is that a rigidity theorem opens the door to a new research 
direction whose goal is to uncover hidden structures in the weaker cate- 
gory: perhaps the rigidity exhibited by the stronger category is actually 
an indication that concepts and theorems of the stronger category are 
"shadows" of a theory that has a significantly wider range of appli- 
cability? This philosophy has been very successful in the context of 
the Ribe program, but similar investigations were also initiated in re- 
sponse to rigidity theorems in other disciplines. For example, it follows 
from the Mostow rigidity theorem [143] that if two closed hyperbolic
$n$-manifolds ($n > 2$) are homotopically equivalent then they are isometric.
This suggests that the volume of a hyperbolic manifold may be
generalized to a homotopy invariant quantity defined for arbitrary manifolds:
an idea that was investigated by Milnor and Thurston [141] and
further developed by Gromov [66] (see also Sections 5.34-5.36 and 5.43
in [69]). These investigations led to the notion of simplicial volume, a
purely topological notion associated to a closed oriented manifold that 
remarkably coincides with the usual volume in the case of hyperbolic 
manifolds. This notion is very helpful for studying general continuous 
maps between hyperbolic manifolds. 

Historical note. Despite the fact that it was first formulated by 
Bourgain, the Ribe program is called this way because it is inspired 
by Ribe's rigidity theorem. I do not know the exact origin of this 
name. In [32] Bourgain explains the program and its motivation from
Ribe's theorem, describes the basic "dictionary" that relates Banach 
space concepts to metric space concepts, presents examples of natural 
steps of the program, raises some open questions, and proves his metric 
characterization of isomorphic uniform convexity as the first success- 
ful completion of a step in the program. Bourgain also writes in [32] 
that "A detailed exposition of this program will appear in J. Linden- 
strauss's forthcoming survey paper [5]." Reference [5] in [32] is cited as 
"J. Lindenstrauss, Topics in the geometry of metric spaces, to appear." 
Probably referring to the same unpublished survey, in [31] Bourgain
also discusses the Ribe program and writes "We refer the reader to the
survey of J. Lindenstrauss [4] for a detailed exposition of this theme",
where reference [4] of [31] is "J. Lindenstrauss, Proceedings Missouri 
Conf., Missouri - Columbia (1984), to appear." Unfortunately, Lin- 
denstrauss' paper was never published. 

This article is intended to serve as an introduction to the Ribe pro- 
gram, targeted at nonspecialists. Aspects of this research direction 
have been previously surveyed in ^ MM M ESI ESDI EHl M
and especially in Ball's Bourbaki exposé [16]. While the material
surveyed here has some overlap with these papers, we cover a substantial
amount of additional topics. We also present sketches of arguments as 
an indication of the type of challenges that the Ribe program raises, 
and we describe examples of applications to areas which are far from 
Banach space theory in order to indicate the versatility of this approach 
to metric geometry. 

Asymptotic notation. Throughout this article we will use the
notation $\lesssim, \gtrsim$ to denote the corresponding inequalities up to universal
constant factors. We will also denote equivalence up to universal
constant factors by $\asymp$, i.e., $A \asymp B$ is the same as $(A \lesssim B) \wedge (A \gtrsim B)$.

Acknowledgements. This article accompanies the 10th Takagi Lec- 
tures delivered by the author at RIMS, Kyoto, on May 26 2012. I 
am grateful to Larry Guth and Manor Mendel for helpful suggestions. 
The research presented here is supported in part by NSF grant CCF- 
0832795, BSF grant 2010021, and the Packard Foundation. 

2. Metric type 

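Fix a Banach space $(X, \|\cdot\|_X)$. By the triangle inequality we have
$\|\varepsilon_1 x_1 + \ldots + \varepsilon_n x_n\|_X \le \|x_1\|_X + \ldots + \|x_n\|_X$ for every $x_1, \ldots, x_n \in X$
and every $\varepsilon_1, \ldots, \varepsilon_n \in \{-1, 1\}$. By averaging this inequality over all
possible choices of signs $\varepsilon_1, \ldots, \varepsilon_n \in \{-1, 1\}$ we obtain the following
randomized triangle inequality.

$$\frac{1}{2^n} \sum_{\varepsilon_1, \ldots, \varepsilon_n \in \{-1,1\}} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X \le \sum_{i=1}^n \|x_i\|_X. \tag{1}$$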

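For $p \ge 1$, the Banach space $X$ is said to have Rademacher type $p$ if
there exists a constant $T \in (0, \infty)$ such that for every $n \in \mathbb{N}$ and every
$x_1, \ldots, x_n \in X$ we have

$$\frac{1}{2^n} \sum_{\varepsilon_1, \ldots, \varepsilon_n \in \{-1,1\}} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X \le T \Big( \sum_{i=1}^n \|x_i\|_X^p \Big)^{1/p}. \tag{2}$$
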
It is immediate to check (from the case of collinear $x_1, \ldots, x_n$) that if (2)
holds then necessarily $p \le 2$. If $p > 1$ and (2) holds then $X$ is said
to have nontrivial type. Note that if this happens then in most cases,
e.g. if $x_1, \ldots, x_n$ are all unit vectors, (2) constitutes an asymptotic
improvement of the triangle inequality (1). For concreteness, we recall
that $L_p(\mu)$ has Rademacher type $\min\{p, 2\}$.
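
To see the difference between the randomized triangle inequality (1) and the type inequality (2) on a concrete case, the following sketch computes the sign average on the left-hand side by brute force. It is only an illustration: the helper names and the choice of the standard basis as test vectors are assumptions, and the enumeration is exponential in $n$. For unit basis vectors, the average equals $n$ under the $\ell_1$ norm (no gain over (1)) and $\sqrt{n}$ under the $\ell_2$ norm (matching (2) with $p = 2$ and $T = 1$).

```python
from itertools import product

def rademacher_average(vectors, norm):
    """Average of ||sum_i eps_i x_i|| over all 2^n choices of signs eps."""
    n = len(vectors)
    dim = len(vectors[0])
    total = 0.0
    for signs in product((-1, 1), repeat=n):
        combo = [sum(s * v[k] for s, v in zip(signs, vectors)) for k in range(dim)]
        total += norm(combo)
    return total / 2 ** n

def l1_norm(v):
    return sum(abs(c) for c in v)

def l2_norm(v):
    return sum(c * c for c in v) ** 0.5

n = 10
# Standard basis of R^n: unit vectors in both the l_1 and the l_2 norm.
basis = [[1.0 if k == i else 0.0 for k in range(n)] for i in range(n)]

avg_l1 = rademacher_average(basis, l1_norm)  # no improvement over the triangle inequality
avg_l2 = rademacher_average(basis, l2_norm)  # square-root growth in n

print(avg_l1, avg_l2)
```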

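Remark 2.1. A classical inequality of Kahane [94] asserts that for
every $q \ge 1$ we have

$$\Big( \frac{1}{2^n} \sum_{\varepsilon_1, \ldots, \varepsilon_n \in \{-1,1\}} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X^q \Big)^{1/q} \le \frac{c(q)}{2^n} \sum_{\varepsilon_1, \ldots, \varepsilon_n \in \{-1,1\}} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X,$$

where $c(q) \in (0, \infty)$ depends on $q$ but not on $n$, the choice of vectors
$x_1, \ldots, x_n \in X$, and the Banach space $X$ itself. Therefore the
property (2) is equivalent to the requirement

$$\Big( \frac{1}{2^n} \sum_{\varepsilon_1, \ldots, \varepsilon_n \in \{-1,1\}} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X^q \Big)^{1/q} \le T \Big( \sum_{i=1}^n \|x_i\|_X^p \Big)^{1/p}, \tag{3}$$

with perhaps a different constant $T \in (0, \infty)$.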

The improved triangle inequality (2) is of profound importance to
the study of geometric and analytic questions in Banach space theory
and harmonic analysis; see [121] and the references therein for more
information on this topic.

The Ribe theorem implies that the property of having type p is 
preserved under uniform homeomorphism of Banach spaces. According 
to the philosophy of the Ribe program, the next goal is to reformulate 
this property while using only distances between points and making no 
reference whatsoever to the linear structure of X. We shall now explain 
the ideas behind the known results on this step of the Ribe program as 
an illustrative example of the geometric and analytic challenges that 
arise when one endeavors to address such questions. 



2.1. Type for metric spaces. The basic idea, due to Enflo [49], and
later to Gromov [67] and Bourgain, Milman and Wolfson [35], is as
follows. Given $x_1, \ldots, x_n \in X$ define $f : \{-1, 1\}^n \to X$ by

$$f(\varepsilon_1, \ldots, \varepsilon_n) = \sum_{i=1}^n \varepsilon_i x_i. \tag{4}$$

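With this notation, the definition of Rademacher type appearing in (2)
is the same as the inequality

$$\mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - f(-\varepsilon)\|_X \big] \le T \Big( \sum_{i=1}^n \mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - f(\varepsilon_1, \ldots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_n)\|_X^p \big] \Big)^{1/p}, \tag{5}$$

where $\mathbb{E}_\varepsilon[\cdot]$ denotes expectation with respect to a uniformly random
choice of $\varepsilon \in \{-1, 1\}^n$.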

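Inequality (5) seems to involve only distances between points,
except for the crucial fact that the function $f$ itself is the linear
function appearing in (4). Enflo's (bold) idea [19] (building on his earlier
work ED]) is to drop the linearity requirement of $f$ and to demand
that (5) holds for all functions $f : \{-1, 1\}^n \to X$. Thus, for $p \ge 1$ we
say that a metric space $(\mathcal{M}, d_{\mathcal{M}})$ has type $p$ if there exists a constant
$T \in (0, \infty)$ such that for every $n \in \mathbb{N}$ and every $f : \{-1, 1\}^n \to \mathcal{M}$,

$$\mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(-\varepsilon)) \big] \le T \Big( \sum_{i=1}^n \mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(\varepsilon_1, \ldots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_n))^p \big] \Big)^{1/p}. \tag{6}$$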

Remark 2.2. The above definition of type of a metric space is ad hoc: 
it was chosen here for the sake of simplicity of exposition. While this 
definition is sufficient for the description of the key ideas and it is also 
strong enough for the ensuing geometric applications, it differs from 
the standard definitions of type for metric spaces that appear in the 
literature. Specifically, motivated by the fact that Rademacher type 
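$p$ for a Banach space $(X, \|\cdot\|_X)$ is equivalent to (3) for any $q \ge 1$,
combining the above reasoning with the case $q = p$ in (3) leads to
Enflo's original definition: say that a metric space $(\mathcal{M}, d_{\mathcal{M}})$ has Enflo
type $p$ if there exists a constant $T \in (0, \infty)$ such that for every $n \in \mathbb{N}$
and every $f : \{-1, 1\}^n \to \mathcal{M}$,

$$\mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(-\varepsilon))^p \big] \le T^p \sum_{i=1}^n \mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(\varepsilon_1, \ldots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_n))^p \big]. \tag{7}$$

Analogously, by (3) with $q = 2$ and Hölder's inequality, if $(X, \|\cdot\|_X)$
has Rademacher type $p \in [1, 2]$ then there exists a constant $T \in (0, \infty)$
such that for every $n \in \mathbb{N}$ and every $x_1, \ldots, x_n \in X$,

$$\frac{1}{2^n} \sum_{\varepsilon_1, \ldots, \varepsilon_n \in \{-1,1\}} \Big\| \sum_{i=1}^n \varepsilon_i x_i \Big\|_X^2 \le T^2 n^{\frac{2}{p} - 1} \sum_{i=1}^n \|x_i\|_X^2.$$

Hence, following the above reasoning, Bourgain, Milman and Wolfson [35]
suggested the following definition of type of metric spaces,
which is more convenient than Enflo type for certain purposes: say that
a metric space $(\mathcal{M}, d_{\mathcal{M}})$ has BMW type $p$ if there exists a constant
$T \in (0, \infty)$ such that for every $n \in \mathbb{N}$ and every $f : \{-1, 1\}^n \to \mathcal{M}$,

$$\mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(-\varepsilon))^2 \big] \le T^2 n^{\frac{2}{p} - 1} \sum_{i=1}^n \mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(\varepsilon_1, \ldots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_n))^2 \big]. \tag{8}$$

In [67] Gromov suggested the above definitions of type of metric spaces,
but only when $p = 2$, in which case (7) and (8) coincide.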

Remark 2.3. For the same reason that Rademacher type p > 1 should 
be viewed as an improved (randomized) triangle inequality, i.e., an 
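improvement over (1), the above definitions of type of metric spaces
should also be viewed as an improved triangle inequality. Indeed, it is
straightforward to check that every metric space $(\mathcal{M}, d_{\mathcal{M}})$ satisfies

$$\mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(-\varepsilon)) \big] \le \sum_{i=1}^n \mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(\varepsilon_1, \ldots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_n)) \big] \tag{9}$$

for every $n \in \mathbb{N}$ and every $f : \{-1, 1\}^n \to \mathcal{M}$. Thus every metric
space has type 1 (equivalently Enflo type 1) with $T = 1$. A similar
application of the triangle inequality shows that every metric space
has BMW type 1 with $T = 1$. Our definition (6) of type of a metric
space $(\mathcal{M}, d_{\mathcal{M}})$ is not formally stronger than (9), and with this in mind
one might prefer to consider the following variant of (6):

$$\mathbb{E}_\varepsilon \big[ d_{\mathcal{M}}(f(\varepsilon), f(-\varepsilon)) \big] \lesssim \mathbb{E}_\varepsilon \Big[ \Big( \sum_{i=1}^n d_{\mathcal{M}}(f(\varepsilon), f(\varepsilon_1, \ldots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_n))^p \Big)^{1/p} \Big]. \tag{10}$$

Note that, by Jensen's inequality, (10) implies (6). We chose to work
with the definition appearing in (6) only for simplicity of notation and
exposition; the argument below will actually yield (10).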
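
The "diagonal versus edges" quantities in (6) and (9) are easy to evaluate by enumeration for small $n$. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the target metric space is taken to be the real line with $d(x, y) = |x - y|$, and the labelling $f$ is random. Inequality (9) must hold for any labelling; for the real line the $p = 2$ bound with $T = 1$ also holds, since a Hilbert space satisfies the Enflo type 2 inequality with constant 1.

```python
import random
from itertools import product

def flip(eps, i):
    """Return eps with its i-th coordinate negated."""
    return eps[:i] + (-eps[i],) + eps[i + 1:]

def diagonal_average(f, d, n):
    """E_eps[d(f(eps), f(-eps))] for eps uniform on {-1,1}^n."""
    cube = list(product((-1, 1), repeat=n))
    return sum(d(f[e], f[tuple(-c for c in e)]) for e in cube) / len(cube)

def edge_power_sum(f, d, n, p):
    """sum_i E_eps[d(f(eps), f(eps with the i-th sign flipped))^p]."""
    cube = list(product((-1, 1), repeat=n))
    return sum(d(f[e], f[flip(e, i)]) ** p for e in cube for i in range(n)) / len(cube)

# Hypothetical test case: a random labelling of the cube by real numbers.
random.seed(0)
n = 4
f = {e: random.uniform(-1.0, 1.0) for e in product((-1, 1), repeat=n)}
d = lambda x, y: abs(x - y)

lhs = diagonal_average(f, d, n)
rhs_p1 = edge_power_sum(f, d, n, 1)          # right-hand side of (9)
rhs_p2 = edge_power_sum(f, d, n, 2) ** 0.5   # right-hand side of (6) with p = 2, T = 1

print(lhs, rhs_p1, rhs_p2)
```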



AN INTRODUCTION TO THE RIBE PROGRAM 




2.2. The geometric puzzle. One would be justified to be concerned
about the "leap of faith" that was performed in Section 2.1. Indeed,
if a Banach space $(X, \|\cdot\|_X)$ satisfies (6) for all linear functions as
in (4) there is no reason to expect that it actually satisfies (6) for all
$f : \{-1, 1\}^n \to X$ whatsoever. Thus, for the discussion in Section 2.1
to be most meaningful one needs to prove that if a Banach space has
Rademacher type $p$ then it also has type $p$ as a metric space (resp. Enflo
type $p$ or BMW type $p$). This question, posed in 1976 by Enflo
(for the case of Enflo type), remains open.

Question 1 (Enflo's problem). Is it true that if a Banach space has 
Rademacher type p then it also has Enflo type p ? 

We will present below an argument that leads to the following slightly
weaker fact: if a Banach space has Rademacher type $p$ then for every
$\varepsilon \in (0, 1)$ it also has type $p - \varepsilon$ as a metric space. We will follow an
elegant argument of Pisier [164], who almost solved Enflo's problem by
showing that if a Banach space has Rademacher type $p$ then it also
has Enflo type $p - \varepsilon$ for every $\varepsilon \in (0, 1)$. Earlier, and via a different
argument, Bourgain, Milman and Wolfson [35] proved that if a Banach
space has Rademacher type $p$ then it also has BMW type $p - \varepsilon$ for every
$\varepsilon \in (0, 1)$. More recently, [128] gave a different, more complicated
(and less useful), definition of type of a metric space, called scaled
Enflo type, and showed that a Banach space has Rademacher type $p$
if and only if it has scaled Enflo type $p$. This completes the Ribe
program for Rademacher type, but it leaves much to be understood,
as we conjecture that the answer to Question 1 is positive. In \15Q\
ins 1149^ [55] it is proved that the answer to Question 1 is positive for
certain classes of Banach spaces (including all $L_p(\mu)$ spaces).

To better understand the geometric meaning of the above problems
and results consider the following alternative description of the
definition of type of a metric space $(\mathcal{M}, d_{\mathcal{M}})$. Call a subset of $2^n$ points
in $\mathcal{M}$ that is indexed by $\{-1, 1\}^n$ a geometric cube in $\mathcal{M}$. A diagonal
of the geometric cube $\{x_\varepsilon\}_{\varepsilon \in \{-1,1\}^n} \subseteq \mathcal{M}$ is a pair $\{x_\varepsilon, x_\delta\}$ where
$\varepsilon, \delta \in \{-1, 1\}^n$ differ in all the coordinates (equiv. $\delta = -\varepsilon$). An edge
of this geometric cube is a pair $\{x_\varepsilon, x_\delta\}$ where $\varepsilon, \delta \in \{-1, 1\}^n$ differ in
exactly one coordinate. Then (6) is the following statement

$$\frac{\sum \mathrm{diagonal}}{2^{n-1}} \le T \Big( \frac{\sum \mathrm{edge}^p}{2^{n-1}} \Big)^{1/p}, \tag{11}$$

where in the left hand side of (11) we have the sum of the lengths
of all the diagonals of the geometric cube, and in the right hand side
of (11) we have the sum of the $p$th power of the lengths of all the edges
of the geometric cube. The assertion that $(\mathcal{M}, d_{\mathcal{M}})$ has type $p$ means
that (11) holds for all geometric cubes in $\mathcal{M}$.

If $(X, \|\cdot\|_X)$ is a Banach space with Rademacher type $p$ then we
know that (11) holds true for all parallelepipeds in $X$, as depicted in
Figure 1.

Figure 1. $X$ having Rademacher type $p$ is equivalent
to the requirement that (11) holds true for
every parallelepiped in $X$, i.e., a set of vectors
$\{x_\delta\}_{\delta \in \{0,1\}^n}$ where for some $x_1, \ldots, x_n \in X$ we have
$x_\delta = \sum_{i=1}^n \delta_i x_i$ for all $\delta = (\delta_1, \ldots, \delta_n) \in \{0, 1\}^n$.



The geometric "puzzle" is therefore to deduce the validity of (11) for
all geometric cubes in $X$ (perhaps with $p$ replaced by $p - \varepsilon$) from the
assumption that it holds for all parallelepipeds. In other words, given
$x_1, \ldots, x_{2^n} \in X$, index these points arbitrarily by $\{-1, 1\}^n$. Once this
is done, some pairs of these points have been declared as diagonals,
and other pairs have been declared as edges, in which case (11) has to
hold true for these pairs; see Figure 2.



2.3. Pisier's argument. Our goal here is to describe an approach, 
devised by Pisier in 1986, to deduce metric type from Rademacher 
type. Before doing so we recall some basic facts related to vector-valued 
Fourier analysis on $\{-1, 1\}^n$. The characters of the group $\{-1, 1\}^n$
(equipped with coordinate-wise multiplication) are the Walsh functions
$\{W_A\}_{A \subseteq \{1,\ldots,n\}}$, where $W_A(\varepsilon) = \prod_{i \in A} \varepsilon_i$. Fix a Banach space $(X, \|\cdot\|_X)$.
Any function $f : \{-1, 1\}^n \to X$ has the Fourier expansion

$$f(\varepsilon) = \sum_{A \subseteq \{1,\ldots,n\}} \hat{f}(A) W_A(\varepsilon),$$
Figure 2. A schematic illustration of the problem
when $n = 3$. Given $x_1, \ldots, x_8 \in X$, we index them
using the labels $\{(\varepsilon_1, \varepsilon_2, \varepsilon_3) : \varepsilon_1, \varepsilon_2, \varepsilon_3 \in \{-1, 1\}\}$ as
depicted above. Once this is done, the dotted lines
represent diagonals and the full lines represent edges.



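where

$$\hat{f}(A) = \frac{1}{2^n} \sum_{\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n) \in \{-1,1\}^n} f(\varepsilon) \prod_{i \in A} \varepsilon_i \in X.$$

For $j \in \{1, \ldots, n\}$ define $\partial_j f : \{-1, 1\}^n \to X$ by

$$\partial_j f(\varepsilon) = \frac{f(\varepsilon) - f(\varepsilon_1, \ldots, \varepsilon_{j-1}, -\varepsilon_j, \varepsilon_{j+1}, \ldots, \varepsilon_n)}{2} = \sum_{\substack{A \subseteq \{1,\ldots,n\} \\ j \in A}} \hat{f}(A) W_A(\varepsilon). \tag{12}$$

The hypercube Laplacian of $f$ is given by

$$\Delta f(\varepsilon) = \sum_{j=1}^n \partial_j f(\varepsilon) = \sum_{A \subseteq \{1,\ldots,n\}} |A| \hat{f}(A) W_A(\varepsilon).$$

The associated time-$t$ evolute of $f$ under the heat semigroup is

$$e^{-t\Delta} f(\varepsilon) = \sum_{A \subseteq \{1,\ldots,n\}} e^{-t|A|} \hat{f}(A) W_A(\varepsilon). \tag{13}$$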

Since the operator $e^{-t\Delta}$ coincides with convolution with the Riesz kernel
$R_t(\varepsilon) = \prod_{i=1}^n (1 + e^{-t} \varepsilon_i)$, which for $t \ge 0$ is the density of a probability
measure on $\{-1, 1\}^n$, we have by convexity

$$t \ge 0 \implies \mathbb{E}_\varepsilon \big[ \|e^{-t\Delta} f(\varepsilon)\|_X \big] \le \mathbb{E}_\varepsilon \big[ \|f(\varepsilon)\|_X \big]. \tag{14}$$
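
For small $n$ the objects above can be computed by direct enumeration. The following sketch is an illustration only, under the simplifying assumption $X = \mathbb{R}$ (so that norms are absolute values); the helper names are hypothetical. It evaluates $e^{-t\Delta} f$ once from the Fourier formula (13) and once as the average $\mathbb{E}_\delta[R_t(\varepsilon\delta) f(\delta)]$ against the Riesz kernel, and compares the $L_1$ norms appearing in (14).

```python
import math
import random
from itertools import product, combinations

n = 3
cube = list(product((-1, 1), repeat=n))
subsets = [A for k in range(n + 1) for A in combinations(range(n), k)]

def walsh(A, eps):
    """W_A(eps) = prod_{i in A} eps_i (the empty product is 1)."""
    return math.prod(eps[i] for i in A)

def fourier_coefficients(f):
    """hat f(A) = 2^{-n} * sum_eps f(eps) W_A(eps)."""
    return {A: sum(f[e] * walsh(A, e) for e in cube) / len(cube) for A in subsets}

def heat_via_fourier(f, t):
    """e^{-t Delta} f computed from the Fourier formula (13)."""
    fhat = fourier_coefficients(f)
    return {e: sum(math.exp(-t * len(A)) * fhat[A] * walsh(A, e) for A in subsets)
            for e in cube}

def heat_via_riesz(f, t):
    """e^{-t Delta} f as the convolution E_delta[R_t(eps * delta) f(delta)]."""
    def riesz(e):
        return math.prod(1 + math.exp(-t) * c for c in e)
    return {e: sum(riesz(tuple(a * b for a, b in zip(e, dl))) * f[dl] for dl in cube) / len(cube)
            for e in cube}

def mean_abs(h):
    """E_eps |h(eps)|, the L_1 norm of a real-valued function on the cube."""
    return sum(abs(h[e]) for e in cube) / len(cube)

# Hypothetical test function: random real values on the cube.
random.seed(1)
f = {e: random.uniform(-1.0, 1.0) for e in cube}
t = 0.7
u = heat_via_fourier(f, t)
v = heat_via_riesz(f, t)
max_gap = max(abs(u[e] - v[e]) for e in cube)

print(max_gap, mean_abs(u), mean_abs(f))
```

The two computations should agree up to rounding, and the $L_1$ norm of the evolute should not exceed that of $f$.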

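It immediately follows from (13) that

$$e^{-t\Delta} \big( W_{\{1,\ldots,n\}} e^{-t\Delta} f \big) = e^{-tn} W_{\{1,\ldots,n\}} f. \tag{15}$$

Consequently, we deduce from (15) and (14) that

$$t \ge 0 \implies \mathbb{E}_\varepsilon \big[ \|e^{-t\Delta} f(\varepsilon)\|_X \big] \ge e^{-nt} \mathbb{E}_\varepsilon \big[ \|f(\varepsilon)\|_X \big]. \tag{16}$$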

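Fix $s > 0$ that will be determined later. Let $g_s^* : \{-1, 1\}^n \to X^*$ be
a normalizing functional of $e^{-s\Delta} f - \hat{f}(\emptyset) \in L_1(\{-1, 1\}^n, X)$, i.e.,

$$\forall \varepsilon \in \{-1, 1\}^n, \quad \|g_s^*(\varepsilon)\|_{X^*} \le 1, \tag{17}$$

and

$$\mathbb{E}_\varepsilon \big[ \|e^{-s\Delta} (f(\varepsilon) - \hat{f}(\emptyset))\|_X \big] = \mathbb{E}_\varepsilon \big[ g_s^*(\varepsilon) \big( e^{-s\Delta} (f(\varepsilon) - \hat{f}(\emptyset)) \big) \big] = \sum_{\substack{A \subseteq \{1,\ldots,n\} \\ A \neq \emptyset}} e^{-s|A|} \widehat{g_s^*}(A) \big( \hat{f}(A) \big). \tag{18}$$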

In [164], Pisier succeeds in relating general geometric cubes in $X$ to
parallelepipeds in $X$ by interpolating $g_s^*$ between two hypercubes.
Specifically, for every $t \ge 0$ consider the function

$$(g_s^*)_t : \{-1, 1\}^n \times \{-1, 1\}^n \to X^*$$

given by

$$(g_s^*)_t(\varepsilon, \delta) = \sum_{A \subseteq \{1,\ldots,n\}} \widehat{g_s^*}(A) \prod_{i \in A} \big( e^{-t} \varepsilon_i + (1 - e^{-t}) \delta_i \big). \tag{19}$$

Equivalently, $(g_s^*)_t(\varepsilon, \delta) = g_s^*(e^{-t} \varepsilon + (1 - e^{-t}) \delta)$, where we interpret
the substitution of the vector $e^{-t} \varepsilon + (1 - e^{-t}) \delta \in \mathbb{R}^n$ into the function
$g_s^*$, which is defined a priori only on $\{-1, 1\}^n$, by formally substituting
this vector into the Fourier expansion of $g_s^*$.

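Yet another way to interpret $(g_s^*)_t(\varepsilon, \delta)$ is to note that for every
$A \subseteq \{1, \ldots, n\}$,

$$\prod_{i \in A} \big( e^{-t} \varepsilon_i + (1 - e^{-t}) \delta_i \big) = W_A(\varepsilon) \prod_{i=1}^n \big( e^{-t} + (1 - e^{-t}) (\varepsilon_i \delta_i)^{\mathbf{1}_A(i)} \big)$$
$$= W_A(\varepsilon) \sum_{B \subseteq \{1,\ldots,n\}} e^{-t|B|} (1 - e^{-t})^{n - |B|} W_{A \setminus B}(\varepsilon \delta)$$
$$= \sum_{B \subseteq \{1,\ldots,n\}} e^{-t|B|} (1 - e^{-t})^{n - |B|} W_{A \cap B}(\varepsilon) W_{A \setminus B}(\delta). \tag{20}$$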
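Hence, by substituting (20) into (19) we have

$$(g_s^*)_t(\varepsilon, \delta) = \sum_{B \subseteq \{1,\ldots,n\}} e^{-t|B|} (1 - e^{-t})^{n - |B|} g_s^* \Big( \sum_{i \in B} \varepsilon_i e_i + \sum_{j \in \{1,\ldots,n\} \setminus B} \delta_j e_j \Big), \tag{21}$$

where $e_1, \ldots, e_n$ is the standard basis of $\mathbb{R}^n$. In particular, it follows
from (17) and (21) that for every $\varepsilon, \delta \in \{-1, 1\}^n$,

$$\|(g_s^*)_t(\varepsilon, \delta)\|_{X^*} \le \sum_{k=0}^n \binom{n}{k} e^{-tk} (1 - e^{-t})^{n-k} = 1. \tag{22}$$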

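By directly expanding the products in (19) and collecting the terms
that are linear in the variables $(\delta_1, \ldots, \delta_n)$, we see that

$$(g_s^*)_t(\varepsilon, \delta) = (e^t - 1) \sum_{i=1}^n \delta_i \sum_{\substack{A \subseteq \{1,\ldots,n\} \\ i \in A}} e^{-t|A|} \widehat{g_s^*}(A) W_{A \setminus \{i\}}(\varepsilon) + \Phi^*_{s,t}(\varepsilon, \delta), \tag{23}$$

where the error term $\Phi^*_{s,t}(\varepsilon, \delta) \in X^*$ satisfies

$$\mathbb{E}_\delta \Big[ \Phi^*_{s,t}(\varepsilon, \delta) \Big( \sum_{i=1}^n \delta_i x_i \Big) \Big] = 0 \tag{24}$$

for all $\varepsilon \in \{-1, 1\}^n$ and all choices of vectors $x_1, \ldots, x_n \in X$. By
substituting $x_i = \varepsilon_i \partial_i f(\varepsilon)$ into (24), and recalling (12), we deduce
from (23) that

$$\mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ (g_s^*)_t(\varepsilon, \delta) \Big( \sum_{i=1}^n \delta_i \varepsilon_i \partial_i f(\varepsilon) \Big) \Big] = (e^t - 1) \sum_{i=1}^n \sum_{\substack{A \subseteq \{1,\ldots,n\} \\ i \in A}} e^{-t|A|} \widehat{g_s^*}(A) \big( \hat{f}(A) \big) = (e^t - 1) \sum_{A \subseteq \{1,\ldots,n\}} |A| e^{-t|A|} \widehat{g_s^*}(A) \big( \hat{f}(A) \big). \tag{25}$$

Recalling (22) we see that

$$\mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ (g_s^*)_t(\varepsilon, \delta) \Big( \sum_{i=1}^n \delta_i \varepsilon_i \partial_i f(\varepsilon) \Big) \Big] \le \mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ \Big\| \sum_{i=1}^n \delta_i \varepsilon_i \partial_i f(\varepsilon) \Big\|_X \Big] = \mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ \Big\| \sum_{i=1}^n \delta_i \partial_i f(\varepsilon) \Big\|_X \Big], \tag{26}$$

where the last equality holds because, for each fixed $\varepsilon$, the sign vector
$(\delta_1 \varepsilon_1, \ldots, \delta_n \varepsilon_n)$ is uniformly distributed on $\{-1, 1\}^n$.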
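Hence,

$$\sum_{\substack{A \subseteq \{1,\ldots,n\} \\ A \neq \emptyset}} e^{-s|A|} \widehat{g_s^*}(A) \big( \hat{f}(A) \big) = \int_s^\infty \Big( \sum_{A \subseteq \{1,\ldots,n\}} |A| e^{-t|A|} \widehat{g_s^*}(A) \big( \hat{f}(A) \big) \Big) dt$$
$$\overset{(25) \wedge (26)}{\le} \Big( \int_s^\infty \frac{dt}{e^t - 1} \Big) \mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ \Big\| \sum_{i=1}^n \delta_i \partial_i f(\varepsilon) \Big\|_X \Big] = \log \Big( \frac{1}{1 - e^{-s}} \Big) \mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ \Big\| \sum_{i=1}^n \delta_i \partial_i f(\varepsilon) \Big\|_X \Big]. \tag{27}$$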



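Recalling (16), it follows from (27) that

$$\mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - \mathbb{E}_\delta[f(\delta)]\|_X \big] \le e^{sn} \log \Big( \frac{1}{1 - e^{-s}} \Big) \mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ \Big\| \sum_{i=1}^n \delta_i \partial_i f(\varepsilon) \Big\|_X \Big]. \tag{28}$$

By choosing $s \asymp \frac{1}{n \log n}$ so as to minimize the right hand side of (28),

$$\mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - \mathbb{E}_\delta[f(\delta)]\|_X \big] \le \big( \log n + O(\log \log n) \big) \mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ \Big\| \sum_{i=1}^n \delta_i \partial_i f(\varepsilon) \Big\|_X \Big]. \tag{29}$$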
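
The passage from (28) to (29) rests on the size of the factor $e^{sn} \log(1/(1 - e^{-s}))$ at $s = 1/(n \log n)$. The following sketch (an illustration with hypothetical function names, taking the implied constant in $s \asymp 1/(n \log n)$ to be 1) evaluates this factor for a few values of $n$ and compares it with $\log n$. It uses `math.expm1` to avoid cancellation when $s$ is tiny.

```python
import math

def prefactor(s, n):
    """e^{sn} * log(1 / (1 - e^{-s})), the factor on the right-hand side of (28)."""
    # 1 - e^{-s} is computed as -expm1(-s) for numerical accuracy at small s.
    return math.exp(s * n) * (-math.log(-math.expm1(-s)))

def prefactor_at_suggested_s(n):
    """The factor evaluated at s = 1 / (n log n) (constant 1 assumed)."""
    s = 1.0 / (n * math.log(n))
    return prefactor(s, n)

sizes = (10, 10**3, 10**6)
values = {n: prefactor_at_suggested_s(n) for n in sizes}
ratios = [values[n] / math.log(n) for n in sizes]

for n in sizes:
    print(n, values[n], math.log(n), values[n] - math.log(n))
```

The factor stays above $\log n$, and its ratio to $\log n$ decreases toward 1 as $n$ grows, in line with the $\log n + O(\log \log n)$ bound in (29).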



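If $X$ has Rademacher type $p > 1$, i.e., it satisfies (2), then

$$\mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - f(-\varepsilon)\|_X \big] \le 2 \mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - \mathbb{E}_\delta[f(\delta)]\|_X \big] \lesssim T (\log n) \mathbb{E}_\varepsilon \Big[ \Big( \sum_{i=1}^n \|\partial_i f(\varepsilon)\|_X^p \Big)^{1/p} \Big]$$
$$\lesssim T (\log n) \Big( \sum_{i=1}^n \mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - f(\varepsilon_1, \ldots, \varepsilon_{i-1}, -\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_n)\|_X^p \big] \Big)^{1/p}. \tag{30}$$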

This proves that if $X$ has Rademacher type $p$ then it almost has type $p$
as a metric space: inequality (6) holds with an additional logarithmic
factor. We have therefore managed to deduce the fully metric "diagonal
versus edge" inequality (11) from the corresponding inequality
for parallelepipeds, though with a (conjecturally) redundant factor of
$\log n$. Using similar ideas, for every $\varepsilon \in (0, 1)$ one can also deduce the
validity of the Enflo type $p$ condition (7) without the $\log n$ term but
with $p$ replaced by $p - \varepsilon$ and the implied constant depending on $\varepsilon$. See
Pisier's paper [164] for the proof of this alternative tradeoff. A similar
tradeoff was previously proved for BMW type using a different method
by Bourgain, Milman and Wolfson [35].

Remark 2.4. An inspection of the above argument reveals that there
exists a universal constant $C \in (0, \infty)$ such that for every Banach space
$(X, \|\cdot\|_X)$, every $q \in [1, \infty]$, every $n \in \mathbb{N}$, and every $f : \{-1, 1\}^n \to X$
we have

$$\Big( \mathbb{E}_\varepsilon \big[ \|f(\varepsilon) - \mathbb{E}_\delta[f(\delta)]\|_X^q \big] \Big)^{1/q} \le C (\log n) \Big( \mathbb{E}_\varepsilon \mathbb{E}_\delta \Big[ \Big\| \sum_{i=1}^n \delta_i \partial_i f(\varepsilon) \Big\|_X^q \Big] \Big)^{1/q}. \tag{31}$$

Inequality (31) was proved in 1986 by Pisier [164], and is known today
as Pisier's inequality. Removal of the $\log n$ factor from (31) for Banach
spaces with nontrivial Rademacher type would yield a positive solution
of Enflo's problem (Question 1). Talagrand proved [179] that there exist
Banach spaces for which the $\log n$ term in (31) cannot be removed, but
we conjecture that if $(X, \|\cdot\|_X)$ has Rademacher type $p > 1$ then the
$\log n$ term in (31) can be replaced by a universal constant (depending
on the geometry of $X$). In [179] it was shown that the $\log n$ term
in (31) can be replaced by a universal constant if $X = \mathbb{R}$, and in [188]
it was shown that this is true for a general Banach space $X$ if $q = \infty$.
In |15Ul IH3] it is shown that the $\log n$ term in (31) can be replaced by
a universal constant for certain classes of Banach spaces that include
all $L_p(\mu)$ spaces, $p \in (1, \infty)$.

2.4. Unique obstructions to type. There is an obvious obstruction
preventing a Banach space $(X,\|\cdot\|_X)$ from having any Rademacher
type $p > 1$: if $X$ contains well-isomorphic copies of
$\ell_1^n = (\mathbb{R}^n,\|\cdot\|_1)$ for all $n \in \mathbb{N}$ then
its Rademacher type must be trivial. Indeed, assume that
$(X,\|\cdot\|_X)$ satisfies (2) and for $n \in \mathbb{N}$ and
$D \in (0,\infty)$ suppose that there exists a linear operator
$A:\ell_1^n \to X$ satisfying $\|x\|_1 \le \|Ax\|_X \le D\|x\|_1$ for
all $x \in \ell_1^n$. Letting $e_1,\ldots,e_n$ be the standard basis of
$\mathbb{R}^n$, it follows that for $x_i = Ae_i$ we have
$\|x_i\|_X \le D$ and

$$\forall\,\varepsilon \in \{-1,1\}^n,\qquad \|\varepsilon_1x_1+\cdots+\varepsilon_nx_n\|_X = \|A(\varepsilon_1e_1+\cdots+\varepsilon_ne_n)\|_X \ge n.$$

These facts are in conflict with (2), since they force the constant $T$
appearing in (2) to satisfy

$$T \ge \frac{n^{1-\frac{1}{p}}}{D}. \tag{32}$$

Pisier proved [161] that the well-embeddability of $\ell_1^n$ is the
only obstruction to nontrivial Rademacher type: a Banach space
$(X,\|\cdot\|_X)$ fails to have nontrivial type if and only if for every
$\varepsilon \in (0,1)$ and every $n \in \mathbb{N}$ there exists a
linear operator $A:\ell_1^n \to X$ satisfying
$\|x\|_1 \le \|Ax\|_X \le (1+\varepsilon)\|x\|_1$ for all
$x \in \ell_1^n$. In other words, once we know that $X$ does not contain
isomorphic copies of $\{\ell_1^n\}_{n=1}^\infty$ we immediately deduce
that the norm on $X$ must satisfy the asymptotically stronger randomized
triangle inequality (2).
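
The computation leading to (32) can be checked by brute force in the model case. The following Python sketch is only an illustration under stated assumptions: it assumes that (2) has the form $\mathbb{E}_\varepsilon\|\sum_i \varepsilon_i x_i\|_X \le T(\sum_i \|x_i\|_X^p)^{1/p}$, it takes $X = \ell_1^n$ with $A$ the identity (so $D = 1$), and the function names are hypothetical.

```python
from itertools import product


def l1_norm(v):
    """The l_1 norm of a finite vector."""
    return sum(abs(c) for c in v)


def type_constant_lower_bound(n, p):
    """Smallest T for which the assumed form of (2) holds for the
    standard basis e_1, ..., e_n of l_1^n:
        E || sum_i eps_i e_i ||_1  <=  T * (sum_i ||e_i||_1^p)^(1/p).
    """
    basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    total = 0.0
    # Average the norm of the signed sum over all 2^n sign vectors.
    for eps in product((-1, 1), repeat=n):
        vec = [sum(eps[i] * basis[i][j] for i in range(n)) for j in range(n)]
        total += l1_norm(vec)
    average = total / 2 ** n
    rhs = sum(l1_norm(e) ** p for e in basis) ** (1.0 / p)
    return average / rhs
```

For every sign vector the signed sum has $\ell_1$ norm exactly $n$, so the returned value is $n^{1-1/p}$, matching (32) with $D = 1$.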

As one of the first examples of the applicability of Banach space
insights to general metric spaces, Bourgain, Milman and Wolfson [35]
proved that the only obstruction preventing a metric space
$(\mathcal{M},d_{\mathcal{M}})$ from having any BMW type $p > 1$ is that
$\mathcal{M}$ contains bi-Lipschitz copies of the Hamming cubes
$\{(\{-1,1\}^n,\|\cdot\|_1)\}_{n=1}^\infty$.

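To make this statement precise it would be useful to recall the
following standard notation from bi-Lipschitz embedding theory: given
two metric spaces $(\mathcal{M},d_{\mathcal{M}})$ and
$(\mathcal{N},d_{\mathcal{N}})$, denote by

$$c_{(\mathcal{M},d_{\mathcal{M}})}(\mathcal{N},d_{\mathcal{N}}) \tag{33}$$

(or $c_{\mathcal{M}}(\mathcal{N})$ if the metrics are clear from the
context) the infimum over those $D \in [1,\infty]$ for which there
exists $f:\mathcal{N} \to \mathcal{M}$ and a scaling factor
$\lambda \in (0,\infty)$ satisfying

$$\forall\,x,y \in \mathcal{N},\qquad \lambda d_{\mathcal{N}}(x,y) \le d_{\mathcal{M}}(f(x),f(y)) \le D\lambda d_{\mathcal{N}}(x,y).$$

This parameter is called the $\mathcal{M}$ distortion of $\mathcal{N}$.
When $\mathcal{M}$ is a Hilbert space, this parameter is called the
Euclidean distortion of $\mathcal{N}$.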
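
For a fixed injective map between finite metric spaces, the smallest $D$ that works in the displayed two-sided inequality (after the best choice of scaling factor) is the product of the largest expansion ratio and the largest contraction ratio. The Python sketch below computes this quantity; the helper names are hypothetical, and the example (the identity map from the Hamming cube with the $\ell_1$ metric into Euclidean space) only gives an upper bound on the Euclidean distortion, since the definition takes an infimum over all maps.

```python
import math
from itertools import combinations, product


def distortion(points, d_src, d_dst, f):
    """Distortion of an injective map f on a finite metric space:
    (largest expansion ratio) * (largest contraction ratio)."""
    expansion = 0.0
    contraction = 0.0
    for x, y in combinations(points, 2):
        ratio = d_dst(f(x), f(y)) / d_src(x, y)
        expansion = max(expansion, ratio)
        contraction = max(contraction, 1.0 / ratio)
    return expansion * contraction


def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def l2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cube_identity_distortion(n):
    """Distortion of the identity map ({-1,1}^n, l_1) -> (R^n, l_2)."""
    cube = list(product((-1, 1), repeat=n))
    return distortion(cube, l1, l2, lambda x: x)
```

Two points of the cube that differ in $k$ coordinates are at $\ell_1$ distance $2k$ and at $\ell_2$ distance $2\sqrt{k}$, so the identity map has distortion $\sqrt{n}$.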

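Suppose that $p > 1$ and $(\mathcal{M},d_{\mathcal{M}})$ satisfies any
of the type $p$ inequalities (6), (7) or (8) (i.e., our definition of
metric type, Enflo type, or BMW type, respectively). If
$c_{\mathcal{M}}(\{-1,1\}^n,\|\cdot\|_1) < D$ then there exists
$f:\{-1,1\}^n \to \mathcal{M}$ and $\lambda > 0$ such that

$$\forall\,\varepsilon,\delta \in \{-1,1\}^n,\qquad \lambda\|\varepsilon-\delta\|_1 \le d_{\mathcal{M}}(f(\varepsilon),f(\delta)) \le D\lambda\|\varepsilon-\delta\|_1.$$

It follows that
$d_{\mathcal{M}}(f(\varepsilon),f(\varepsilon_1,\ldots,-\varepsilon_i,\ldots,\varepsilon_n)) \le 2D\lambda$
for all $\varepsilon \in \{-1,1\}^n$ and $i \in \{1,\ldots,n\}$. Also,
$d_{\mathcal{M}}(f(\varepsilon),f(-\varepsilon)) \ge 2n\lambda$ for all
$\varepsilon \in \{-1,1\}^n$. Hence any one of the nonlinear type
conditions (6), (7) or (8) implies that

$$c_{\mathcal{M}}(\{-1,1\}^n,\|\cdot\|_1) \gtrsim n^{1-\frac{1}{p}}. \tag{34}$$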

Bourgain, Milman and Wolfson proved [35] (see also the exposition
in [164]) that a metric space $(\mathcal{M},d_{\mathcal{M}})$ fails to
satisfy the improved randomized triangle inequality (8) if and only if
$c_{\mathcal{M}}(\{-1,1\}^n,\|\cdot\|_1) = 1$ for all
$n \in \mathbb{N}$. It is open whether the same "unique obstruction"
result holds true for Enflo type as well.

We note in passing that it follows from (30) and (34) that if
$(X,\|\cdot\|_X)$ is a normed space with type $p > 1$ then

$$c_X(\{-1,1\}^n,\|\cdot\|_1) \gtrsim_X \frac{n^{1-\frac{1}{p}}}{\log n}, \tag{35}$$

where the implied constant may depend on the geometry of $X$ but not
on $n$. In combination with (32), we deduce that $c_X(\ell_1^n)$ and
$c_X(\{-1,1\}^n)$ have the same asymptotic order of magnitude, up to a
logarithmic term which we conjecture can be removed. This logarithmic
term is indeed not needed if $X$ is an $L_p(\mu)$ space, as shown by
Enflo [49] for $p \in (1,2]$ and in [150] for $p \in (2,\infty)$
(alternative proofs are given in [99, 149]). It is tempting to guess
that $(\{-1,1\}^n,\|\cdot\|_1)$ has (up to constant factors) the largest
$\ell_p$ distortion among all subsets of $\ell_1$ of size $2^n$. This
stronger statement remains a challenging open problem; it has been
almost solved (again, up to a logarithmic factor) only for $p = 2$
in [H].



3. Metric cotype 



The natural "dual" notion to Rademacher type, called Rademacher
cotype, arises from reversing the inequalities in (2) or (3) (formally,
duality is a subtle issue in this context; see [122, 163]). Specifically,
say that a Banach space $(X,\|\cdot\|_X)$ has Rademacher cotype
$q \in [1,\infty]$ if there exists a constant $C \in (0,\infty)$ such
that for every $n \in \mathbb{N}$ and every $x_1,\ldots,x_n \in X$ we
have

$$\Big(\sum_{i=1}^n \|x_i\|_X^q\Big)^{1/q} \le C\,\mathbb{E}_\varepsilon\left[\Big\|\sum_{i=1}^n \varepsilon_i x_i\Big\|_X\right]. \tag{36}$$

It is simple to check that if (36) holds then necessarily
$q \in [2,\infty]$ and that every Banach space has Rademacher cotype
$\infty$ (with $C = 1$). As in the case of Rademacher type, the notion
of Rademacher cotype is of major importance to Banach space theory;
e.g. it affects the dimension of almost spherical sections of convex
bodies [61]. For more information on the notion of Rademacher cotype
(including a historical discussion), see the survey [121] and the
references therein.

As explained in Remark 2.1, Kahane's inequality implies that the
requirement (36) is equivalent (with a different constant $C$) to the
requirement

$$\sum_{i=1}^n \|x_i\|_X^q \le C^q\,\mathbb{E}_\varepsilon\left[\Big\|\sum_{i=1}^n \varepsilon_i x_i\Big\|_X^q\right]. \tag{37}$$

For simplicity of notation we will describe below metric variants
of (37), though the discussion carries over mutatis mutandis also to
the natural analogues of (36).



In Banach spaces it is very meaningful to reverse the inequality in
the definition of Rademacher type, but in metric spaces reversing the
inequality in the definition of Enflo type results in a requirement
that no metric space can satisfy unless it consists of a single point
(the same assertion holds true for our definition of metric type (6)
and BMW type (8), but we will only discuss Enflo type from now on).
Indeed, assume that a metric space $(\mathcal{M},d_{\mathcal{M}})$
satisfies

$$\sum_{i=1}^n \mathbb{E}_\varepsilon\big[d_{\mathcal{M}}\big(f(\varepsilon),f(\varepsilon_1,\ldots,\varepsilon_{i-1},-\varepsilon_i,\varepsilon_{i+1},\ldots,\varepsilon_n)\big)^q\big] \le C^q\,\mathbb{E}_\varepsilon\big[d_{\mathcal{M}}(f(\varepsilon),f(-\varepsilon))^q\big] \tag{38}$$

for all $f:\{-1,1\}^n \to \mathcal{M}$. If $\mathcal{M}$ contains two
distinct points $x_0,y_0$ then apply (38) to a function
$f:\{-1,1\}^n \to \{x_0,y_0\}$ chosen uniformly at random from the
$2^{2^n}$ possible functions of this type. The right hand side of (38)
will always be bounded by $C^q d_{\mathcal{M}}(x_0,y_0)^q$, while the
expectation over the random function $f$ of the left hand side of (38)
is $\frac{n}{2}d_{\mathcal{M}}(x_0,y_0)^q$, since the two endpoints of
each edge receive different values with probability $\frac12$. Thus
necessarily $C \gtrsim n^{1/q}$.
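
The averaging step in this argument can be verified exhaustively for small $n$. The Python sketch below (all names are hypothetical) enumerates every function from $\{-1,1\}^n$ to a two-point space at distance `d0` and averages the left hand side of (38), taken here in the form $\sum_{i}\mathbb{E}_\varepsilon[d(f(\varepsilon),f(\varepsilon^{(i)}))^q]$ with $\varepsilon^{(i)}$ denoting $\varepsilon$ with its $i$-th sign flipped.

```python
from itertools import product


def edge_sum(f, n, q, dist):
    """sum_i E_eps dist(f(eps), f(eps with the i-th sign flipped))^q."""
    cube = list(product((-1, 1), repeat=n))
    total = 0.0
    for eps in cube:
        for i in range(n):
            flipped = eps[:i] + (-eps[i],) + eps[i + 1:]
            total += dist(f[eps], f[flipped]) ** q
    return total / len(cube)


def average_edge_sum_over_all_functions(n, q, d0=1.0):
    """Average of edge_sum over all 2^(2^n) functions from the cube
    into a two-point metric space whose points are at distance d0."""
    cube = list(product((-1, 1), repeat=n))

    def dist(a, b):
        return 0.0 if a == b else d0

    values = []
    for labels in product((0, 1), repeat=len(cube)):
        f = dict(zip(cube, labels))
        values.append(edge_sum(f, n, q, dist))
    return sum(values) / len(values)
```

The average equals $\frac{n}{2}d_0^q$, while the right hand side of (38) never exceeds $C^q d_0^q$; this is the source of the bound $C \gtrsim n^{1/q}$. The enumeration is doubly exponential in $n$, so it is only practical for $n \le 4$.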

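In [130] the following definition of metric cotype was introduced. A
metric space $(\mathcal{M},d_{\mathcal{M}})$ has metric cotype $q$ if
there exists a constant $C \in (0,\infty)$ such that for every
$n \in \mathbb{N}$ there exists an even integer $m \in \mathbb{N}$ such
that every $f:\mathbb{Z}_m^n \to \mathcal{M}$ satisfies

$$\sum_{j=1}^n\sum_{x \in \mathbb{Z}_m^n} d_{\mathcal{M}}\Big(f\Big(x+\frac{m}{2}e_j\Big),f(x)\Big)^q \le \frac{(Cm)^q}{3^n}\sum_{\varepsilon \in \{-1,0,1\}^n}\sum_{x \in \mathbb{Z}_m^n} d_{\mathcal{M}}\big(f(x+\varepsilon),f(x)\big)^q. \tag{39}$$

Here $e_1,\ldots,e_n$ are the standard basis of the discrete torus
$\mathbb{Z}_m^n$ and addition is performed modulo $m$. The average over
$\varepsilon \in \{-1,0,1\}^n$ on the right hand side of (39) is natural
here, as it corresponds to the $\ell_\infty$ edges of the discrete
torus.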

It turns out that it is possible to complete the step of the Ribe
program corresponding to Rademacher cotype via the above definition of
metric cotype. Specifically, the following theorem was proved in [130].

Theorem 3.1. A Banach space $(X,\|\cdot\|_X)$ has Rademacher cotype $q$
if and only if it has metric cotype $q$.

The definition of metric cotype stipulates that for every
$n \in \mathbb{N}$ there exists an even integer $m \in \mathbb{N}$ for
which (39) holds true, but for certain applications it is important to
have good bounds on $m$. The argument that was used above to rule
out (38) shows that if $(\mathcal{M},d_{\mathcal{M}})$ contains at least
two points then the validity of (39) implies that $m \gtrsim n^{1/q}$.
In [130] it was proved that one can ensure that $m$ has this order of
magnitude if $X$ is a Banach space with nontrivial Rademacher type.

Theorem 3.2. Let $(X,\|\cdot\|_X)$ be a Banach space with Rademacher
cotype $q < \infty$ and Rademacher type $p > 1$. Then (39) holds true
for some even integer $m \le \kappa n^{1/q}$, where
$\kappa \in (0,\infty)$ depends on the geometry of $X$ but not on $n$.



As an example of an application of Theorem 3.2, the following
characterization of the values of $p,q \in [1,\infty)$ for which
$L_p[0,1]$ is uniformly homeomorphic to a subset of $L_q[0,1]$ was
obtained in [130], answering a question posed by Enflo [52] in 1976.

Theorem 3.3. Fix $p,q \in [1,\infty)$. Then $L_p[0,1]$ is uniformly
homeomorphic to a subset of $L_q[0,1]$ if and only if either $p \le q$
or $p,q \in [1,2]$.

An analogous result was proved for coarse embeddings in [130] and
for quasisymmetric embeddings in [145], answering a question posed
by Väisälä [185]. The link between Theorem 3.2 and these results is
that one can argue that if $(\mathcal{M},d_{\mathcal{M}})$
satisfies (39) with $m \lesssim n^{1/q}$ then any Banach space that
embeds into $\mathcal{M}$ in one of these senses inherits the cotype of
$\mathcal{M}$. Thus, metric cotype (with appropriate dependence of $m$
on $n$) is an obstruction to a variety of weak notions of metric
embeddings. The following natural open question is of major importance.



Question 2. Is it possible to obtain the conclusion of Theorem 3.2
without the assumption that $X$ has nontrivial Rademacher type? In
other words, is it true that any Banach space $(X,\|\cdot\|_X)$ with
Rademacher cotype $q < \infty$ satisfies (39) with
$m \lesssim_X n^{1/q}$?



We conjecture that the answer to Question 2 is positive, in which
case metric cotype itself, without additional assumptions, would be
an invariant for uniform, coarse and quasisymmetric embeddings. For
a general Banach space $(X,\|\cdot\|_X)$ of Rademacher cotype $q$ the
best known bound on $m$ in terms of $n$ in (39), due to [63], is
$m \lesssim_X n^{1+\frac{1}{q}}$.



There are additional applications of metric cotype for which the
dependence of $m$ on $n$ in (39) has no importance. In analogy to the
discussion in Section 2.4, it was proved by Maurey and Pisier [122] that
the only obstruction that can prevent a Banach space $(X,\|\cdot\|_X)$
from having finite Rademacher cotype is the presence of well-isomorphic
copies of $\{\ell_\infty^n\}_{n=1}^\infty$. In [130] a variant of the
definition of metric cotype was given, in analogy to the
Bourgain-Milman-Wolfson variant of Enflo type, and it was shown that a
metric space $\mathcal{M}$ fails to have finite metric cotype in this
sense if and only if
$c_{\mathcal{M}}(\{1,\ldots,m\}^n,\|\cdot\|_\infty) = 1$ for every
$m,n \in \mathbb{N}$. This nonlinear Maurey-Pisier theorem was used
in [130] to prove the following dichotomy result for general metric
spaces, answering a question posed by Arora, Lovász, Newman, Rabani,
Rabinovich and Vempala [7] and improving a Ramsey-theoretical result of
Matoušek [116].

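Theorem 3.4 (General metric dichotomy [130]). Let $\mathcal{F}$ be a
family of metric spaces. Then one of the following dichotomic
possibilities must hold true.

• For every finite metric space $(\mathcal{M},d_{\mathcal{M}})$ and for
every $\varepsilon \in (0,\infty)$ there exists
$\mathcal{N} \in \mathcal{F}$ such that
$c_{\mathcal{N}}(\mathcal{M}) \le 1+\varepsilon$.

• There exists $\alpha(\mathcal{F}),\kappa(\mathcal{F}) \in (0,\infty)$
and for each $n \in \mathbb{N}$ there exists an $n$-point metric space
$(\mathcal{M}_n,d_{\mathcal{M}_n})$ such that for every
$\mathcal{N} \in \mathcal{F}$ we have

$$c_{\mathcal{N}}(\mathcal{M}_n) \ge \kappa(\mathcal{F})(\log n)^{\alpha(\mathcal{F})}.$$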



We refer to [129, Sec. 1.1] and [133], as well as the survey
paper [124], for more information on the theory of metric dichotomies.
Theorem 3.4 leaves the following fundamental question open.

Question 3 (Metric cotype dichotomy problem [130, 133]). Can one
replace the constant $\alpha(\mathcal{F})$ of Theorem 3.4 by a constant
$\alpha \in (0,\infty)$ that is independent of the family
$\mathcal{F}$? It isn't even known if one can take
$\alpha(\mathcal{F}) = 1$ for all families of metric spaces
$\mathcal{F}$.

4. Markov type and cotype 

As part of his investigation of the Lipschitz extension problem [15],
K. Ball introduced a stronger version of type of metric spaces called
Markov type. Other than its applications to Lipschitz extension, the
notion of Markov type has found many applications in embedding theory,
some of which will be described in Section 9.4.

Recall that a stochastic process $\{Z_t\}_{t=0}^\infty$ taking values
in $\{1,\ldots,n\}$ is called a stationary reversible Markov chain if
there exists an $n$ by $n$ stochastic matrix $A = (a_{ij})$ such that
for every $t \in \mathbb{N}\cup\{0\}$ and every
$i,j \in \{1,\ldots,n\}$ we have $\Pr[Z_{t+1} = j \mid Z_t = i] = a_{ij}$,
for every $i \in \{1,\ldots,n\}$ the probability $\pi_i = \Pr[Z_t = i]$
does not depend on $t$, and $\pi_i a_{ij} = \pi_j a_{ji}$ for all
$i,j \in \{1,\ldots,n\}$.

A metric space $(\mathcal{M},d_{\mathcal{M}})$ is said to have Markov
type $p \in (0,\infty)$ with constant $M \in (0,\infty)$ if for every
$n \in \mathbb{N}$, every stationary reversible Markov chain
$\{Z_t\}_{t=0}^\infty$ on $\{1,\ldots,n\}$, every
$f:\{1,\ldots,n\} \to \mathcal{M}$ and every time $t \in \mathbb{N}$ we
have

$$\mathbb{E}\big[d_{\mathcal{M}}(f(Z_t),f(Z_0))^p\big] \le M^p t\,\mathbb{E}\big[d_{\mathcal{M}}(f(Z_1),f(Z_0))^p\big]. \tag{40}$$

Note that the triangle inequality implies that every metric space has
Markov type 1 with constant 1. Ball proved [15] that if $p \in [1,2]$
then any $L_p(\mu)$ space has Markov type $p$ with constant 1. Thus,
while it is well-known that the standard random walk on the integers is
expected to be at distance at most $\sqrt{t}$ from the origin after $t$
steps, Ball established the less well-known fact that any stationary
reversible random walk in Hilbert space has this property. If a metric
space has Markov type $p$ then it also has Enflo type $p$, as proved
in [150]. In essence, Enflo type $p$ corresponds to (40) in the special
case when the Markov chain is the standard random walk on the Hamming
cube $\{-1,1\}^n$. Thus the Markov type $p$ condition is a strengthening
of Enflo type, its power arising in part from the flexibility to choose
any stationary reversible Markov chain whatsoever.
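
Inequality (40) is easy to test numerically in the simplest Hilbertian case. The Python sketch below is an illustration only: it uses the lazy random walk on a cycle (a symmetric stochastic matrix, hence reversible with uniform stationary distribution), real-valued $f$, and $p = 2$; the function names are hypothetical. By Ball's theorem quoted above, every ratio it returns should be at most 1.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lazy_cycle_walk(n):
    """Symmetric stochastic matrix of the lazy walk on the n-cycle."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] += 0.5
        a[i][(i + 1) % n] += 0.25
        a[i][(i - 1) % n] += 0.25
    return a


def mean_sq_displacement(power, f):
    """E |f(Z_t) - f(Z_0)|^2 when Z_0 is uniform and power = A^t."""
    n = len(f)
    return sum(power[i][j] * (f[i] - f[j]) ** 2
               for i in range(n) for j in range(n)) / n


def markov_type_2_ratios(a, f, t_max):
    """Ratios E|f(Z_t)-f(Z_0)|^2 / (t * E|f(Z_1)-f(Z_0)|^2), t = 1..t_max."""
    one_step = mean_sq_displacement(a, f)
    power = a
    ratios = []
    for t in range(1, t_max + 1):
        ratios.append(mean_sq_displacement(power, f) / (t * one_step))
        power = mat_mul(power, a)
    return ratios
```

For example, `markov_type_2_ratios(lazy_cycle_walk(12), f, 20)` with any non-constant list `f` of 12 real numbers starts at 1 for $t = 1$ and stays below 1 afterwards.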

Remark 4.1. We do not know to what extent Enflo type $p > 1$ implies
Markov type $p$ (or perhaps Markov type $q$ for some $1 < q < p$). When
the metric space $(\mathcal{M},d_{\mathcal{M}})$ is an unweighted graph
equipped with the shortest path metric (as is often the case in
applications), it is natural to introduce an intermediate notion of
Markov type in which the Markov chains are only allowed to "move" along
edges, i.e., by considering (40) under the additional restriction that
if $a_{ij} > 0$ then $\{f(i),f(j)\}$ is an edge. Call this notion "edge
Markov type $p$". For some time it was unclear whether edge Markov type
$p$ implies Markov type $p$. However, in [147] it was shown that there
exists a Cayley graph with edge Markov type $p$ for every
$1 < p < \frac43$ that does not have nontrivial Enflo type. It is
unknown whether a similar example exists with edge Markov type 2.

In [149] it was shown that for $p \in [2,\infty)$ any $L_p(\mu)$ space
has Markov type 2 (with constant $M \asymp \sqrt{p}$). More generally,
it is proved in [149] that any $p$-uniformly smooth Banach space has
Markov type $p$. Uniform smoothness, and its dual notion uniform
convexity, are defined as follows. Let $(X,\|\cdot\|_X)$ be a normed
space with unit sphere $S_X = \{x \in X : \|x\|_X = 1\}$. The modulus of
uniform convexity of $X$ is defined for $\varepsilon \in [0,2]$ as

$$\delta_X(\varepsilon) = \inf\left\{1-\frac{\|x+y\|_X}{2} :\ x,y \in S_X,\ \|x-y\|_X = \varepsilon\right\}. \tag{41}$$

$X$ is said to be uniformly convex if $\delta_X(\varepsilon) > 0$ for
all $\varepsilon \in (0,2]$. $X$ is said to have modulus of uniform
convexity of power type $q$ if there exists a constant
$c \in (0,\infty)$ such that $\delta_X(\varepsilon) \ge c\varepsilon^q$
for all $\varepsilon \in [0,2]$. It is straightforward to check that in
this case necessarily $q \ge 2$. The modulus of uniform smoothness of
$X$ is defined for $\tau \in (0,\infty)$ as

$$\rho_X(\tau) = \sup\left\{\frac{\|x+\tau y\|_X+\|x-\tau y\|_X}{2}-1 :\ x,y \in S_X\right\}. \tag{42}$$

$X$ is said to be uniformly smooth if
$\lim_{\tau \to 0}\rho_X(\tau)/\tau = 0$. $X$ is said to have modulus of
uniform smoothness of power type $p$ if there exists a constant
$C \in (0,\infty)$ such that $\rho_X(\tau) \le C\tau^p$ for all
$\tau \in (0,\infty)$. It is straightforward to check that in this case
necessarily $p \in [1,2]$.
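
As a sanity check of definition (41), the modulus of uniform convexity of the Euclidean plane can be evaluated directly. The Python sketch below (hypothetical function names) samples pairs of unit vectors at distance $\varepsilon$ and compares the result with the closed form $1-\sqrt{1-\varepsilon^2/4}$, which follows from the parallelogram identity and shows that Hilbert space has modulus of uniform convexity of power type 2 with $c = \frac18$.

```python
import math


def convexity_defect(x, y):
    """1 - ||x + y|| / 2 for two vectors of the Euclidean plane."""
    return 1.0 - math.hypot(x[0] + y[0], x[1] + y[1]) / 2.0


def hilbert_modulus(eps, samples=360):
    """Infimum in (41) over sampled pairs of unit vectors of the plane
    at Euclidean distance eps, for 0 < eps <= 2."""
    # Unit vectors at angle theta are at distance 2 * sin(theta / 2).
    theta = 2.0 * math.asin(eps / 2.0)
    best = float("inf")
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        x = (math.cos(phi), math.sin(phi))
        y = (math.cos(phi + theta), math.sin(phi + theta))
        best = min(best, convexity_defect(x, y))
    return best
```

In the Euclidean case every admissible pair gives the same value, so the sampling is redundant; it is kept only to mirror the infimum in (41), which is genuinely needed for other norms.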

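For concreteness, we recall [74] (see also [17]) that if
$p \in (1,\infty)$ then
$\delta_{\ell_p}(\varepsilon) \gtrsim_p \varepsilon^{\max\{p,2\}}$ and
$\rho_{\ell_p}(\tau) \lesssim_p \tau^{\min\{p,2\}}$. The moduli
appearing in (41) and (42) relate to each other via the following
classical duality formula of Lindenstrauss [106]:

$$\rho_{X^*}(\tau) = \sup\left\{\frac{\tau\varepsilon}{2}-\delta_X(\varepsilon) :\ \varepsilon \in [0,2]\right\}. \tag{43}$$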
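
The duality formula (43) can be illustrated numerically for Hilbert space, which is its own dual. The Python sketch below assumes the standard closed forms $\delta(\varepsilon) = 1-\sqrt{1-\varepsilon^2/4}$ and $\rho(\tau) = \sqrt{1+\tau^2}-1$ for the Euclidean moduli, and approximates the supremum in (43) by a grid search; the function names and the grid size are arbitrary choices.

```python
import math


def hilbert_delta(eps):
    """Modulus of uniform convexity of Hilbert space (closed form)."""
    return 1.0 - math.sqrt(max(0.0, 1.0 - eps * eps / 4.0))


def dual_smoothness_modulus(tau, delta, grid=20000):
    """Grid approximation of sup { tau * eps / 2 - delta(eps) : 0 <= eps <= 2 }."""
    best = float("-inf")
    for k in range(grid + 1):
        eps = 2.0 * k / grid
        best = max(best, tau * eps / 2.0 - delta(eps))
    return best
```

For moderate values of $\tau$ the grid value agrees with $\sqrt{1+\tau^2}-1$ to several decimal places; the maximizer is $\varepsilon = 2\tau/\sqrt{1+\tau^2}$.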

An important theorem of Pisier [162] asserts that $X$ admits an
equivalent uniformly convex norm if and only if it admits an equivalent
norm whose modulus of uniform convexity is of power type $q$ for some
$q \in [2,\infty)$. Similarly, $X$ admits an equivalent uniformly smooth
norm if and only if it admits an equivalent norm whose modulus of
uniform smoothness is of power type $p$ for some $p \in (1,2]$.

We will revisit these notions later, but at this point it suffices to
say that, as proved in [149], any Banach space that admits an equivalent
norm whose modulus of uniform smoothness is of power type $p$ also has
Markov type $p$. The relation between Rademacher type $p$ and Markov
type $p$ is unclear. While for every $p \in (1,2]$ there exist Banach
spaces with Rademacher type $p$ that do not admit any equivalent
uniformly smooth norm [Sni UtiS], the following question remains open.

Question 4. Does there exist a Banach space $(X,\|\cdot\|_X)$ with
Markov type $p > 1$ that does not admit an equivalent uniformly smooth
norm?

In addition to uniformly smooth Banach spaces, the Markov type of
several spaces of interest has been computed. For example, the
following classes of metric spaces are known to have Markov type 2:
weighted graph theoretical trees [149], series parallel graphs [37],
hyperbolic groups [149], simply connected Riemannian manifolds with
pinched negative sectional curvature [149], Alexandrov spaces of
nonnegative curvature [154]. The Markov type of certain
$p$-Wasserstein spaces has also been computed.

Recall that a metric space $(\mathcal{M},d_{\mathcal{M}})$ is doubling
if there exists $K \in \mathbb{N}$ such that for every
$x \in \mathcal{M}$ and $r \in (0,\infty)$ there exist
$y_1,\ldots,y_K \in \mathcal{M}$ such that
$B(x,r) \subseteq B(y_1,r/2)\cup\ldots\cup B(y_K,r/2)$, i.e., every ball
in $\mathcal{M}$ can be covered by $K$ balls of half the radius. Here,
and in what follows,
$B(z,\rho) = \{w \in \mathcal{M} : d_{\mathcal{M}}(z,w) \le \rho\}$ for
all $z \in \mathcal{M}$ and $\rho \ge 0$. The parameter $K$ is called a
doubling constant of $(\mathcal{M},d_{\mathcal{M}})$.

Question 5. Does every doubling metric space have Markov type 2?
Specifically, does the Heisenberg group have Markov type 2?

Assouad's embedding theorem [S] says that if
$(\mathcal{M},d_{\mathcal{M}})$ is doubling then the metric space
$(\mathcal{M},d_{\mathcal{M}}^{1-\varepsilon})$ admits a bi-Lipschitz
embedding into Hilbert space for every $\varepsilon \in (0,1)$. As
observed in [149], this implies that if $(\mathcal{M},d_{\mathcal{M}})$
is doubling then it has Markov type $p$ for all $p < 2$. It was also
shown in [149] that if $(\mathcal{M},d_{\mathcal{M}})$ is doubling with
constant $K \in (1,\infty)$ then for every $n \in \mathbb{N}$, every
stationary reversible Markov chain $\{Z_t\}_{t=0}^\infty$ on
$\{1,\ldots,n\}$, every $f:\{1,\ldots,n\} \to \mathcal{M}$ and every
time $t \in \mathbb{N}$,

$$\forall\,u > 0,\qquad \Pr\Big[d_{\mathcal{M}}(f(Z_t),f(Z_0)) \ge u\sqrt{t}\Big] \le \frac{O\big((\log K)^2\big)}{u^2}\,\mathbb{E}\big[d_{\mathcal{M}}(f(Z_1),f(Z_0))^2\big]. \tag{44}$$

Thus, one can say that doubling spaces have "weak Markov type 2".
Using the method of [166] it is also possible to show that doubling
spaces have Enflo type 2.

Further support of a positive answer to Question 5 was obtained
in [149], where it was shown that the Laakso graphs
$\{G_k\}_{k=0}^\infty$ have Markov type 2. These graphs are
defined [101] iteratively by letting $G_0$ be a single edge and
obtaining $G_{i+1}$ by replacing the middle third of each edge of $G_i$
by a quadrilateral; see Figure 3. Equipped with their shortest path
metric, each Laakso graph $G_k$ is doubling with constant 6 (see the
proof of [102, Thm. 2.3]), yet, as proved by Laakso [101], we have
$\lim_{k \to \infty} c_2(G_k) = \infty$ (in fact [102, Thm. 2.3] asserts
that $c_2(G_k) \gtrsim \sqrt{k}$). The graphs $\{G_k\}_{k=0}^\infty$ are
among the standard examples of doubling spaces that do not well-embed
into Hilbert space, yet, as proved in [149], they do have Markov
type 2. The Heisenberg group,
i.e., the group of all 3 by 3 matrices generated by the set

$$S = \left\{\begin{pmatrix}1&1&0\\0&1&0\\0&0&1\end{pmatrix},\ \begin{pmatrix}1&-1&0\\0&1&0\\0&0&1\end{pmatrix},\ \begin{pmatrix}1&0&0\\0&1&1\\0&0&1\end{pmatrix},\ \begin{pmatrix}1&0&0\\0&1&-1\\0&0&1\end{pmatrix}\right\}$$

and equipped with the associated word metric, is another standard
example of a doubling space that does not admit a bi-Lipschitz
embedding into Hilbert space [160, 175]. However, as indicated in
Question 5, the intriguing problem whether the Heisenberg group has
Markov type 2 remains open.

Figure 3. The first four Laakso graphs.

Note that by the nonlinear Maurey-Pisier theorem [130], as discussed
in Section 3, a doubling metric space must have finite metric cotype.
The Laakso graphs $\{G_k\}_{k=0}^\infty$, being examples of series
parallel graphs, admit a bi-Lipschitz embedding into $\ell_1$ with
distortion bounded by a constant independent of $k$, as proved in [73].
Since $\ell_1$ has Rademacher cotype 2, it follows from Theorem 3.1 that
the Laakso graphs have metric cotype 2 (with the constant $C$ in (39)
taken to be independent of $k$). We do not know if all doubling metric
spaces have metric cotype 2. The Heisenberg group is a prime example for
which this question remains open. Note that the Heisenberg group does
not embed into any $L_1(\mu)$ space [H]. Therefore the above reasoning
for the Laakso graphs does not apply to the Heisenberg group.

Metric trees and the Laakso graphs are nontrivial examples of planar
graphs that have Markov type 2. This result of [149] was extended to all
series parallel graphs in [37]. It was also shown in [149] that any
planar graph satisfies the weak Markov type 2 inequality (44), and
using [166] one can show that planar graphs have Enflo type 2. It
remains open whether all planar graphs have Markov type 2.

4.1. Lipschitz extension via Markov type and cotype. Here we 
explain Ball's original motivation for introducing Markov type. 

Ball also introduced in [15] a linear property of Banach spaces that
he called Markov cotype 2, and he indicated a two-step definition that
could be used to extend this notion to general metric spaces. Motivated
by Ball's ideas, the following variant of his definition was introduced
in [132]. A metric space $(\mathcal{M},d_{\mathcal{M}})$ has metric
Markov cotype $q \in (0,\infty)$ with constant $C \in (0,\infty)$ if for
every $m,n \in \mathbb{N}$, every $n$ by $n$ symmetric stochastic matrix
$A = (a_{ij})$ and every $x_1,\ldots,x_n \in \mathcal{M}$, there exist
$y_1,\ldots,y_n \in \mathcal{M}$ satisfying

$$\sum_{i=1}^n d_{\mathcal{M}}(x_i,y_i)^q + m\sum_{i=1}^n\sum_{j=1}^n a_{ij}\,d_{\mathcal{M}}(y_i,y_j)^q \le C^q\sum_{i=1}^n\sum_{j=1}^n\left(\frac{1}{m}\sum_{t=0}^{m-1}A^t\right)_{ij} d_{\mathcal{M}}(x_i,x_j)^q. \tag{45}$$



To better understand the meaning of (45), observe that the Markov
type $p$ condition for $(\mathcal{M},d_{\mathcal{M}})$ implies that

$$\sum_{i=1}^n\sum_{j=1}^n (A^m)_{ij}\,d_{\mathcal{M}}(x_i,x_j)^p \le M^p m\sum_{i=1}^n\sum_{j=1}^n a_{ij}\,d_{\mathcal{M}}(x_i,x_j)^p. \tag{46}$$

Thus (45) aims to reverse the direction of the inequality in (46), with
the following changes. One is allowed to pass from the initial points
$x_1,\ldots,x_n \in \mathcal{M}$ to new points
$y_1,\ldots,y_n \in \mathcal{M}$. The first summand in the left hand
side of (45) ensures that on average $y_i$ is close to $x_i$. The
remaining terms in (45) correspond to the reversal of (46), with
$\{x_i\}_{i=1}^n$ replaced by $\{y_i\}_{i=1}^n$ in the left hand side,
and the power $A^m$ replaced by the Cesàro average
$\frac{1}{m}\sum_{t=0}^{m-1}A^t$.



Although (45) was inspired by Ball's ideas, the formal relation
between the above definition of metric Markov cotype and Ball's original
definition in [15] is unclear. We chose to work with the above
definition since it suffices for the purpose of Ball's original
application, and in addition it can be used for other purposes.
Specifically, metric Markov cotype is key to the development of calculus
for nonlinear spectral gaps and the construction of super-expanders; an
aspect of the Ribe program that we will not describe here for lack of
space (see [132]).

For $q \in [1,\infty)$, a metric space $(\mathcal{M},d_{\mathcal{M}})$
is called $W_q$-barycentric with constant $\Gamma \in (0,\infty)$ if for
every finitely supported probability measure $\mu$ on $\mathcal{M}$
there exists a point $\beta_\mu \in \mathcal{M}$ (a barycenter of $\mu$)
such that $\beta_{\delta_x} = x$ for all $x \in \mathcal{M}$ and for
every two finitely supported probability measures $\mu,\nu$ we have
$d_{\mathcal{M}}(\beta_\mu,\beta_\nu) \le \Gamma W_q(\mu,\nu)$, where
$W_q(\cdot,\cdot)$ denotes the $q$-Wasserstein metric (see
[187, Sec. 7.1]). Note that by convexity every Banach space is
$W_q$-barycentric with constant 1.

The following theorem from [135] is a metric space variant of Ball's
Lipschitz extension theorem [15] (the proof follows the same ideas as
in [15] with some technical differences of lesser importance).



Theorem 4.1. Fix $q \in (0,\infty)$ and let
$(\mathcal{M},d_{\mathcal{M}})$, $(\mathcal{N},d_{\mathcal{N}})$ be two
metric spaces. Assume that $\mathcal{M}$ has Markov type $q$ with
constant $M$ and $\mathcal{N}$ has metric Markov cotype $q$ with
constant $C$. Assume also that $\mathcal{N}$ is $W_q$-barycentric with
constant $\Gamma$. Then for every $A \subseteq \mathcal{M}$, every
finite $S \subseteq \mathcal{M} \setminus A$, and every Lipschitz
mapping $f:A \to \mathcal{N}$ there exists $F:A \cup S \to \mathcal{N}$
satisfying $F(x) = f(x)$ for all $x \in A$ and

$$\|F\|_{\mathrm{Lip}} \lesssim_{\Gamma,M,C} \|f\|_{\mathrm{Lip}},$$

where the implied constant depends only on $\Gamma$, $M$, $C$.



Ball proved Theorem 4.1 when $\mathcal{N}$ is a Banach space, $q = 2$,
and the metric Markov cotype assumption is replaced by his linear notion
of Markov cotype. He proved that every Banach space that admits an
equivalent norm with modulus of uniform convexity of power type 2
satisfies his notion of Markov cotype 2. In combination with [149], it
follows that the conclusion of Theorem 4.1 holds true if $\mathcal{M}$
is a Banach space that admits an equivalent norm with modulus of uniform
smoothness of power type 2 and $\mathcal{N}$ is a Banach space that
admits an equivalent norm with modulus of uniform convexity of power
type 2. In particular, for $1 < q \le 2 \le p < \infty$ we can take
$\mathcal{M} = \ell_p$ and $\mathcal{N} = \ell_q$. This answers
positively a 1983 conjecture of Johnson and Lindenstrauss [88]. The
motivation of the question of Johnson and Lindenstrauss belongs to the
Ribe program (see also [115]): to obtain a metric analogue of a
classical theorem of Maurey [120] that implies this result for linear
operators, i.e., in Maurey's setting $A$ is a closed linear subspace and
$f$ is a linear operator, in which case the conclusion is that
$F:\mathcal{M} \to \mathcal{N}$ is a bounded linear operator with
$\|F\| \lesssim_{M,C} \|f\|$. Examples of applications of Ball's
extension theorem can be found in [144, 126].

In [135] it is shown that if $(X,\|\cdot\|_X)$ is a Banach space that
admits an equivalent norm with modulus of uniform convexity of power
type $q$ then it has metric Markov cotype $q$. Also, it is shown
in [135] that certain barycentric metric spaces have metric Markov
cotype $q$; this is true in particular for CAT(0) spaces, and hence also
all simply connected manifolds of nonpositive sectional curvature
(see |3^). These facts, in conjunction with Theorem 4.1, yield new
Lipschitz extension theorems; see [135].

For Banach spaces the notion of metric Markov cotype $q$ does not
coincide with Rademacher cotype: one can deduce from a clever
construction of Kalton [95] that there exists a closed linear subspace
$X$ of $L_1$ (hence $X$ has Rademacher cotype 2) that does not have
metric Markov cotype $q$ for any $q < \infty$. The following natural
question remains open.

Question 6. Does $\ell_1$ have metric Markov cotype 2?

By Theorem 4.1, a positive solution of Question 6 would answer a well
known question of Ball [15], by showing that every Lipschitz function
from a subset of $\ell_2$ to $\ell_1$ can be extended to a Lipschitz
function defined on all of $\ell_2$. See pj,4] for ramifications of this
question in theoretical computer science.

5. Markov convexity 

Deep work of James [84, 85] and Enflo [51] implies that a Banach
space $(X,\|\cdot\|_X)$ admits an equivalent uniformly convex norm if
and only if it admits an equivalent uniformly smooth norm, and these
properties are equivalent to the assertion that any Banach space
$(Y,\|\cdot\|_Y)$ that is finitely representable in $X$ must be
reflexive. Such spaces are called superreflexive Banach spaces. The Ribe
program suggests that superreflexivity has a purely metric
reformulation. This is indeed the case, as proved by Bourgain [32].

For $k,n \in \mathbb{N}$ let $T_n^k$ denote the complete $k$-regular
tree of depth $n$, i.e., the finite unweighted rooted tree such that the
length of any root-leaf path equals $n$ and every non-leaf vertex has
exactly $k$ adjacent vertices. We shall always assume that $T_n^k$ is
equipped with the shortest path metric $d_{T_n^k}(\cdot,\cdot)$, i.e.,
the distance between any two vertices is the sum of their distances to
their least common ancestor. Bourgain's characterization of
superreflexivity [32] asserts that a Banach space $(X,\|\cdot\|_X)$
admits an equivalent uniformly convex norm if and only if for all
$k \ge 3$ we have

$$\lim_{n \to \infty} c_X(T_n^k) = \infty. \tag{47}$$

Bourgain's proof also yields the following asymptotic computation of
the Euclidean distortion of $T_n^k$:

$$k \ge 3 \implies c_2(T_n^k) \asymp \sqrt{\log n}. \tag{48}$$

All known proofs of the lower bound $c_2(T_n^k) \gtrsim \sqrt{\log n}$
are non-trivial (in addition to the original proof of [32], alternative
proofs appeared in \in\, [113], [104]). In this section we will describe
a proof of (48) from the viewpoint of random walks.

It is a nontrivial consequence of the work of Pisier [162] that the
Banach space property of admitting an equivalent norm whose modulus of
uniform convexity has power type $p$ is an isomorphic local linear
property. As such, the Ribe program calls for a purely metric
reformulation of this property. Since Pisier proved [162] that a Banach
space is superreflexive if and only if it admits an equivalent norm
whose modulus of uniform convexity has power type $p$ for some
$p \in [2,\infty)$, this question should be viewed as asking for a
quantitative refinement of Bourgain's metric characterization of
superreflexivity.

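The following definition is due to [104]. Let
$\{Z_t\}_{t \in \mathbb{Z}}$ be a Markov chain on a state space
$\Omega$. Given integers $k,s \ge 0$, denote by
$\{\widetilde{Z}_t(s)\}_{t \in \mathbb{Z}}$ the process that equals
$Z_t$ for time $t \le s$, and evolves independently (with respect to the
same transition probabilities) for time $t > s$. Fix $p > 0$. A metric
space $(\mathcal{M},d_{\mathcal{M}})$ is called Markov $p$-convex with
constant $\Pi$ if for every Markov chain $\{Z_t\}_{t \in \mathbb{Z}}$ on
a state space $\Omega$, and every mapping $f:\Omega \to \mathcal{M}$,

$$\sum_{s=0}^\infty\sum_{t \in \mathbb{Z}}\frac{\mathbb{E}\big[d_{\mathcal{M}}\big(f(Z_t),f(\widetilde{Z}_t(t-2^s))\big)^p\big]}{2^{sp}} \le \Pi^p\sum_{t \in \mathbb{Z}}\mathbb{E}\big[d_{\mathcal{M}}(f(Z_t),f(Z_{t-1}))^p\big]. \tag{49}$$

The infimum over those $\Pi \in [0,\infty]$ for which (49) holds for all
Markov chains is called the Markov $p$-convexity constant of
$\mathcal{M}$, and is denoted $\Pi_p(\mathcal{M})$. We say that
$(\mathcal{M},d_{\mathcal{M}})$ is Markov $p$-convex if
$\Pi_p(\mathcal{M}) < \infty$.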



We will see in a moment how to work with (49), but we first state
the following theorem, which constitutes a completion of the step of
the Ribe program that corresponds to the Banach space property of
admitting an equivalent norm whose modulus of uniform convexity has
power type $p$. The "only if" part of this statement is due to [104] and
the "if" part is due to [129].

Theorem 5.1. Fix $p \in [2,\infty)$. A Banach space $(X,\|\cdot\|_X)$
admits an equivalent norm whose modulus of uniform convexity has power
type $p$ if and only if $(X,\|\cdot\|_X)$ is Markov $p$-convex.



The meaning of (49) will become clearer once we examine the following
example. Fix an integer $k \ge 3$ and let $\{Z_t\}_{t\in\mathbb{Z}}$ be the following
Markov chain whose state space is $T_n^k$. $Z_t$ equals the root of $T_n^k$ for $t \le 0$,
and $\{Z_t\}_{t\in\mathbb{N}}$ is the standard outward random walk (i.e., if $0 \le t < n$
then $Z_{t+1}$ is distributed uniformly over the $k-1$ neighbors of $Z_t$ that are
further away from the root than $Z_t$), with absorbing states at the leaves.
Suppose that $(\mathcal{M}, d_{\mathcal{M}})$ is a metric space that is Markov $p$-convex with
constant $\Pi$, and for some $\lambda, D \in (0, \infty)$ we are given an embedding
$f : T_n^k \to \mathcal{M}$ that satisfies
$\lambda d_{T_n^k}(x,y) \le d_{\mathcal{M}}(f(x),f(y)) \le D\lambda d_{T_n^k}(x,y)$
for all $x, y \in T_n^k$. For every $s, t \in \mathbb{N}$ such that $2^s \le t \le n$, with
probability at least $1 - 1/(k-1)$ the vertices $Z_{t-2^s+1}$ and $\widetilde{Z}_{t-2^s+1}(t-2^s)$ are
distinct, in which case $d_{T_n^k}(Z_t, \widetilde{Z}_t(t-2^s)) = 2^{s+1}$. It therefore follows
from (49) that

$$\lambda^p n \log n \lesssim \sum_{s=0}^{\infty}\sum_{t\in\mathbb{Z}} \frac{\mathbb{E}\left[d_{\mathcal{M}}\left(f(Z_t), f\left(\widetilde{Z}_t(t-2^s)\right)\right)^p\right]}{2^{sp}} \le \Pi^p \sum_{t\in\mathbb{Z}} \mathbb{E}\left[d_{\mathcal{M}}\left(f(Z_t), f(Z_{t-1})\right)^p\right] \le \Pi^p D^p \lambda^p n.$$

Consequently,

$$c_{\mathcal{M}}(T_n^k) \gtrsim \frac{1}{\Pi} (\log n)^{1/p}.$$

In particular, when $\mathcal{M} = \ell_2$ this explains (48).
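The key probabilistic step in the tree example (two copies of the outward walk that share their first $t - 2^s$ steps split immediately with probability $1 - 1/(k-1)$, and are then at distance $2^{s+1}$) can be checked empirically. The following is a minimal sketch under simplifying assumptions that are ours, not the text's: every vertex, including the root, has exactly $k-1$ children, a vertex is encoded as the tuple of child indices on the path from the root, and $t \le n$ so that absorption at the leaves never occurs. The names `tree_distance`, `sample_pair` and `split_statistics` are illustrative.

```python
import random


def tree_distance(u, v):
    """Graph distance between two vertices of a rooted tree, each encoded as
    the tuple of child indices along the path from the root."""
    common = 0
    while common < min(len(u), len(v)) and u[common] == v[common]:
        common += 1
    return (len(u) - common) + (len(v) - common)


def sample_pair(k, n, s, t, rng):
    """Sample (Z_t, Z~_t(t - 2**s)): both walks share their first t - 2**s
    steps and then evolve independently for 2**s further steps."""
    if not (k >= 3 and 2 ** s <= t <= n):
        raise ValueError("need k >= 3 and 2**s <= t <= n")
    shared = t - 2 ** s
    prefix = tuple(rng.randrange(k - 1) for _ in range(shared))
    z = prefix + tuple(rng.randrange(k - 1) for _ in range(2 ** s))
    z_tilde = prefix + tuple(rng.randrange(k - 1) for _ in range(2 ** s))
    return z, z_tilde


def split_statistics(k, n, s, t, samples=20000, seed=0):
    """Return (fraction of samples whose first independent steps differ,
    whether every such sample is at distance exactly 2**(s+1))."""
    rng = random.Random(seed)
    shared = t - 2 ** s
    split = 0
    all_far = True
    for _ in range(samples):
        z, z_tilde = sample_pair(k, n, s, t, rng)
        if z[shared] != z_tilde[shared]:
            split += 1
            all_far = all_far and tree_distance(z, z_tilde) == 2 ** (s + 1)
    return split / samples, all_far


if __name__ == "__main__":
    fraction, all_far = split_statistics(k=4, n=16, s=2, t=10)
    print("empirical split frequency:", fraction, "vs", 1 - 1 / (4 - 1))
    print("distance 2**(s+1) whenever the walks split:", all_far)
```

In this simplified model the split probability is exactly $1 - 1/(k-1)$; the text only needs it as a lower bound.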

A different choice of Markov chain can be used in combination with 
Markov convexity to compute the asymptotic behavior of the Euclidean 
distortion of the lamplighter group over $\mathbb{Z}_n$; see [104] [T^. Similar
reasoning also applies to the Laakso graphs $\{G_k\}_{k=0}^{\infty}$, as depicted in
Figure 4. In this case let $\{Z_t\}_{t=0}^{\infty}$ be the Markov chain that starts at the
leftmost vertex of $G_k$ (see Figure 4), and at each step moves to the
right. If $Z_t$ is a vertex of degree 3 then $Z_{t+1}$ equals one of the two
vertices on the right of $Z_t$, each with probability 1/2. An argument along
the above lines (see [129, Sec. 3]) yields

This estimate is sharp when $\mathcal{M} = \ell_q$ for all $q \in (1, \infty)$. Note that since
the Laakso graphs are doubling, they do not contain bi-Lipschitz copies
of $T_n^k$ with distortion bounded independently of $n$. Thus the Markov
convexity invariant applies equally well to trees and Laakso graphs,
despite the fact that these examples are very different from each other
as metric spaces. Recently Johnson and Schechtman [90] proved that
if for a Banach space $(X, \|\cdot\|_X)$ we have $\lim_{k\to\infty} c_X(G_k) = \infty$ then
$X$ is superreflexive. Thus the nonembeddability of the Laakso graphs
is a metric characterization of superreflexivity that is different from
Bourgain's characterization (47).

In addition to uniformly convex Banach spaces, other classes of
metric spaces for which Markov convexity has been computed include
Alexandrov spaces of nonnegative curvature [12] (they are Markov
2-convex) and the Heisenberg group (it is Markov 4-convex, as shown
by Sean Li). Markov convexity has several applications to metric
geometry, including a characterization of tree metrics that admit a
bi-Lipschitz embedding into Euclidean space [104], a polynomial time
approximation algorithm to compute the distortion of tree metrics [104],
and applications to the theory of Lipschitz quotients [129].



6. Metric smoothness? 

Since a Banach space admits an equivalent uniformly convex norm
if and only if it admits an equivalent uniformly smooth norm,
Bourgain's characterization of superreflexivity implies that, for every $k \ge 3$,
a Banach space $X$ admits an equivalent uniformly smooth norm if and
only if $\lim_{n\to\infty} c_X(T_n^k) = \infty$. Nevertheless, a subtlety of this problem
appears if one is interested in equivalent norms whose modulus of
uniform smoothness has a given power type. Specifically, a Banach space
$X$ admits an equivalent norm whose modulus of uniform smoothness
has power type $p$ if and only if $X^*$ admits an equivalent norm whose
modulus of uniform convexity has power type $p/(p-1)$; this is an
immediate consequence of (43). Despite this fact, and in contrast to
Theorem 5.1, we do not know how to complete the Ribe program for
the property of admitting an equivalent norm whose modulus of
uniform smoothness has power type $p$. The presence of trees and Laakso
graphs is a natural obstruction to uniform convexity, but it remains
open to isolate a natural (and useful) family of metric spaces whose
presence is an obstruction to uniform smoothness of power type $p$.



7. Bourgain's discretization problem 

Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed spaces with unit balls
$B_X$ and $B_Y$, respectively. For $\varepsilon \in (0,1)$ let $\delta_{X\hookrightarrow Y}(\varepsilon)$ be the
supremum over those $\delta \in (0,1)$ such that every $\delta$-net $\mathcal{N}_\delta$ in $B_X$ satisfies
$c_Y(\mathcal{N}_\delta) \ge (1-\varepsilon) c_Y(X)$. $\delta_{X\hookrightarrow Y}(\cdot)$ is called the discretization modulus
corresponding to $X, Y$. Ribe's theorem follows from the assertion that
if $\dim(X) < \infty$ then $\delta_{X\hookrightarrow Y}(\varepsilon) > 0$ for all $\varepsilon \in (0,1)$. This implication
follows from the classical observation [44] that uniformly continuous
mappings on Banach spaces are bi-Lipschitz for large distances, and a






w* differentiation argument of Heinrich and Mankiewicz [77] ; see 
for the details. 

In [33] Bourgain found a new proof of Ribe's theorem that furnished
an explicit bound on $\delta_{X\hookrightarrow Y}(\cdot)$. Specifically, if $\dim(X) = n$ then

$$\forall \varepsilon \in (0,1), \qquad \delta_{X\hookrightarrow Y}(\varepsilon) \ge e^{-(n/\varepsilon)^{Cn}}, \qquad (50)$$

where $C \in (0, \infty)$ is a universal constant. (50) should be viewed as a
quantitative version of Ribe's theorem, and it yields an abstract and
generic way to obtain a family of finite metric spaces that serve as
obstructions whose presence characterizes the failure of any given
isomorphic finite dimensional linear property of Banach spaces; see [157, 159].

In light of the Ribe program it would be of great interest to
determine the asymptotic behavior in $n$ of, say, $\delta_{X\hookrightarrow Y}(1/2)$. However, the
bound (50) remains the best known estimate, while the known (simple)
upper bounds on $\delta_{X\hookrightarrow Y}(1/2)$ decay like a power of $n$; see [M].
This question is of interest even when $X, Y$ are restricted to certain
subclasses of Banach spaces, in which the following improvement is
known [61]: for all $p \in [1, \infty)$ we have $\delta_{X\hookrightarrow L_p(\mu)}(1/2) \gtrsim (\dim(X))^{-5/2}$
(the implied constant is universal). We refer to ^1] for a more gen- 
eral statement along these lines, as well as to |105t |H2] for alternative 
approaches to this question. 



8. Nonlinear Dvoretzky theorems 

A classical theorem of Dvoretzky [17] asserts, in confirmation of a
conjecture of Grothendieck [70], that for every $k \in \mathbb{N}$ and $D > 1$ there
exists $n = n(k, D) \in \mathbb{N}$ such that every $n$-dimensional normed space has
a $k$-dimensional linear subspace that embeds into Hilbert space with
distortion $D$; see [139, 138, 174] for the best known bounds on $n(k, D)$.
In accordance with the Ribe program, Bourgain, Figiel and Milman
asked in 1986 if there is an analogue of the Dvoretzky phenomenon
which holds for general metric spaces. Specifically, they investigated
the largest $m \in \mathbb{N}$ such that any finite metric space $(\mathcal{M}, d_{\mathcal{M}})$ of
cardinality $n$ has a subset $S \subseteq \mathcal{M}$ with $|S| \ge m$ such that the metric space
$(S, d_{\mathcal{M}})$ embeds with distortion $D$ into Hilbert space. Twenty years
later, Tao asked an analogous question in terms of Hausdorff
dimension: given $\alpha > 0$ and $D > 1$, what is the supremum over those $\beta \ge 0$
such that every compact metric space $\mathcal{M}$ with $\dim_H(\mathcal{M}) \ge \alpha$ has a
subset $S \subseteq \mathcal{M}$ with $\dim_H(S) \ge \beta$ that embeds into Hilbert space
with distortion $D$? Here $\dim_H(\cdot)$ denotes Hausdorff dimension.

A pleasing aspect of the Ribe program is that sometimes we get more
than we asked for. In our case, we asked for almost Euclidean subsets,
but the known answers to the above questions actually provide subsets
that are even more structured: they are approximately ultrametric.
Before describing these answers to the above questions, we therefore
first discuss the structure of ultrametric spaces, since this additional
structure is crucial for a variety of applications.

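8.1. The structure of ultrametric spaces. Let $(\mathcal{M}, d_{\mathcal{M}})$ be an
ultrametric space, i.e.,

$$\forall x, y, z \in \mathcal{M}, \qquad d_{\mathcal{M}}(x,y) \le \max\{d_{\mathcal{M}}(x,z), d_{\mathcal{M}}(y,z)\}. \qquad (51)$$

In the discussion below, assume for simplicity that $\mathcal{M}$ is finite: this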
case contains all the essential ideas, and the natural extensions to in- 
finite ultrametric spaces can be found in e.g. [HD 11341 [55] . Define an 
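equivalence relation $\sim$ on $\mathcal{M}$ by

$$\forall x, y \in \mathcal{M}, \qquad x \sim y \iff d_{\mathcal{M}}(x,y) < \mathrm{diam}(\mathcal{M}) = \max_{z,w\in\mathcal{M}} d_{\mathcal{M}}(z,w).$$

Observe that it is the ultra-triangle inequality (51) that makes $\sim$ be
indeed an equivalence relation. Let $A_1, \ldots, A_k$ be the corresponding
equivalence classes. Thus $d_{\mathcal{M}}(x,y) < \mathrm{diam}(\mathcal{M})$ if $(x,y) \in \bigcup_{i=1}^k A_i \times A_i$
and $d_{\mathcal{M}}(x,y) = \mathrm{diam}(\mathcal{M})$ if $(x,y) \in (\mathcal{M}\times\mathcal{M}) \setminus \bigcup_{i=1}^k A_i \times A_i$.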

By applying this construction to each equivalence class separately,
and iterating, one obtains a sequence of partitions $\mathcal{P}_0, \ldots, \mathcal{P}_n$ of $\mathcal{M}$
such that $\mathcal{P}_0 = \{\mathcal{M}\}$, $\mathcal{P}_n = \{\{x\}\}_{x\in\mathcal{M}}$ and $\mathcal{P}_{i+1}$ is a refinement
of $\mathcal{P}_i$ for all $i \in \{0, \ldots, n-1\}$. Moreover, for every $x, y \in \mathcal{M}$, if
we let $i \in \{0, \ldots, n\}$ be the maximal index such that $x, y \in A$ for
some $A \in \mathcal{P}_i$, then $d_{\mathcal{M}}(x,y) = \mathrm{diam}(A)$. Alternatively, consider the
following graph-theoretical tree whose vertices are labeled by subsets
of $\mathcal{M}$. The root is labeled by $\mathcal{M}$ and the $i$th level of the tree is in
one-to-one correspondence with the elements of the partition $\mathcal{P}_i$. The
descendants of an $i$ level vertex whose label is $A \in \mathcal{P}_i$ are declared to
be the $i+1$ level vertices whose labels are $\{B \in \mathcal{P}_{i+1} : B \subseteq A\}$. With
this combinatorial picture in mind, $\mathcal{M}$ can be identified as the leaves of
the tree and the metric on $\mathcal{M}$ has the following simple description: the
distance between any two leaves is the diameter of the set corresponding
to their least common ancestor in the tree. This simple combinatorial
structure of ultrametric spaces will be harnessed extensively in the
ensuing discussion. See [HH I134[ 198] for an extension of this picture to 
infinite compact ultrametric spaces (in which case the points of Al are 
in one-to-one correspondence with the ends of an infinite tree). 
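The recursive splitting into equivalence classes described above is easy to carry out for a finite ultrametric space. The following is a minimal sketch, assuming the metric is given as a nested dictionary `d[x][y]` of a genuine ultrametric (distinct points at positive distance); the function names and the four-point example are ours, chosen only for illustration. It builds the tree and checks that the distance between two leaves equals the diameter of their least common ancestor.

```python
def is_ultrametric(points, d):
    """Check symmetry, positivity off the diagonal and the ultra-triangle inequality."""
    for x in points:
        for y in points:
            if d[x][y] != d[y][x]:
                return False
            if (x == y and d[x][y] != 0) or (x != y and d[x][y] <= 0):
                return False
            for z in points:
                if d[x][y] > max(d[x][z], d[y][z]):
                    return False
    return True


def split_into_classes(points, d):
    """Return (diameter, classes) for the relation x ~ y iff d[x][y] < diameter."""
    diam = max(d[x][y] for x in points for y in points)
    remaining = list(points)
    classes = []
    while remaining:
        x = remaining[0]
        classes.append([y for y in remaining if d[x][y] < diam])
        remaining = [y for y in remaining if d[x][y] >= diam]
    return diam, classes


def build_tree(points, d):
    """A node is a triple (points, diameter, children); leaves are singletons."""
    if len(points) == 1:
        return (list(points), 0, [])
    diam, classes = split_into_classes(points, d)
    return (list(points), diam, [build_tree(c, d) for c in classes])


def lca_diameter(node, x, y):
    """Diameter of the set labeling the least common ancestor of leaves x and y."""
    _, diam, children = node
    for child in children:
        if x in child[0] and y in child[0]:
            return lca_diameter(child, x, y)
    return diam


def example_metric():
    """Four points: d(a,b) = 1, d(c,e) = 2, all other distances equal 3."""
    points = ["a", "b", "c", "e"]
    close = {frozenset("ab"): 1, frozenset("ce"): 2}
    d = {x: {y: 0 if x == y else close.get(frozenset((x, y)), 3) for y in points}
         for x in points}
    return points, d


if __name__ == "__main__":
    points, d = example_metric()
    tree = build_tree(points, d)
    print(all(lca_diameter(tree, x, y) == d[x][y] for x in points for y in points))
```

The leaves of this sketch may sit at different depths; padding with repeated singleton classes would recover the levelled partitions of the text.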

We record two more consequences of the above discussion. First of
all, by considering the natural lexicographical order that is induced on
the leaves of the tree, we obtain a linear order $\prec$ on $\mathcal{M}$ such that if
$x, y \in \mathcal{M}$ satisfy $x \preceq y$ then

$$\mathrm{diam}([x,y]) = \mathrm{diam}(\{z \in \mathcal{M} : x \preceq z \preceq y\}) = d_{\mathcal{M}}(x,y). \qquad (52)$$

See [98] for a proof of the existence of a linear order satisfying (52)
for every compact ultrametric space $(\mathcal{M}, d_{\mathcal{M}})$, in which case the order
interval $[x,y]$ is always a Borel subset of $\mathcal{M}$.
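For a finite ultrametric space the order in (52) can be produced by listing the leaves of the tree class by class. The sketch below is illustrative only: it assumes the metric is a nested dictionary `d[x][y]` of a genuine ultrametric, and the names `leaf_order` and `satisfies_interval_identity` are ours. It also shows that an arbitrary ordering of the points typically violates (52).

```python
def leaf_order(points, d):
    """List the points so that every equivalence class, at every level of the
    recursive splitting, occupies a contiguous block."""
    if len(points) <= 1:
        return list(points)
    diam = max(d[x][y] for x in points for y in points)
    remaining = list(points)
    order = []
    while remaining:
        x = remaining[0]
        block = [y for y in remaining if d[x][y] < diam]   # class of x
        order += leaf_order(block, d)
        remaining = [y for y in remaining if d[x][y] >= diam]
    return order


def satisfies_interval_identity(order, d):
    """Check diam([x, y]) == d(x, y) for every order interval, as in (52)."""
    for i in range(len(order)):
        for j in range(i, len(order)):
            interval = order[i:j + 1]
            diam = max(d[x][y] for x in interval for y in interval)
            if diam != d[order[i]][order[j]]:
                return False
    return True


def example_metric():
    """Four points: d(a,b) = 1, d(c,e) = 2, all other distances equal 3."""
    points = ["a", "b", "c", "e"]
    close = {frozenset("ab"): 1, frozenset("ce"): 2}
    d = {x: {y: 0 if x == y else close.get(frozenset((x, y)), 3) for y in points}
         for x in points}
    return points, d


if __name__ == "__main__":
    points, d = example_metric()
    print(satisfies_interval_identity(leaf_order(points, d), d))
    print(satisfies_interval_identity(["a", "c", "b", "e"], d))
```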

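The second consequence that we wish to record here is that $(\mathcal{M}, d_{\mathcal{M}})$
admits an isometric embedding into the sphere of radius $\mathrm{diam}(\mathcal{M})/\sqrt{2}$
of Hilbert space. This is easily proved by induction on $|\mathcal{M}|$ as follows.
Letting $A_1, \ldots, A_k$ be the equivalence classes as above, by the
induction hypothesis there exist isometric embeddings $f_i : A_i \to H_i$, where
$H_1, \ldots, H_k$ are Hilbert spaces and $\|f_i(x)\|_{H_i} = \mathrm{diam}(A_i)/\sqrt{2}$ for all
$i \in \{1, \ldots, k\}$ and $x \in A_i$. Now define

$$f : \mathcal{M} \to \Big(\bigoplus_{i=1}^k H_i\Big) \oplus \ell_2^k \stackrel{\mathrm{def}}{=} H$$

by

$$x \in A_i \implies f(x) = f_i(x) + \sqrt{\frac{\mathrm{diam}(\mathcal{M})^2 - \mathrm{diam}(A_i)^2}{2}}\; e_i,$$

where $e_1, \ldots, e_k$ is the standard basis of $\ell_2^k = (\mathbb{R}^k, \|\cdot\|_2)$. One deduces
directly from this definition, and the fact that $d_{\mathcal{M}}(x,y) = \mathrm{diam}(\mathcal{M})$ if
$(x,y) \in (\mathcal{M}\times\mathcal{M}) \setminus \bigcup_{i=1}^k A_i \times A_i$, that $\|f(x)\|_H = \mathrm{diam}(\mathcal{M})/\sqrt{2}$ for all $x \in \mathcal{M}$
and $\|f(x) - f(y)\|_H = d_{\mathcal{M}}(x,y)$ for all $x, y \in \mathcal{M}$. See [186] for more
information on Hilbertian isometric embeddings of ultrametric spaces.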

Thus, the reader should keep the following picture in mind when
considering a finite ultrametric space $(\mathcal{M}, d_{\mathcal{M}})$: it corresponds to the
leaves of a tree that are isometrically embedded in Hilbert space.
Moreover, for every node of the tree the distinct subtrees that are rooted at
its children are, after translation, mutually orthogonal.
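The inductive construction of the isometric embedding into a Hilbert sphere can be carried out explicitly for a finite ultrametric space. The following is a minimal sketch in plain Python, assuming the metric is a nested dictionary `d[x][y]` of a genuine ultrametric and `points` is a list; coordinates are stored as Python lists, and the names `embed_ultrametric` and `euclidean` are ours. A single point is sent to the zero vector of a zero-dimensional space, each equivalence class gets its own block of coordinates, and one extra coordinate per class carries the correction term $\sqrt{(\mathrm{diam}(\mathcal{M})^2 - \mathrm{diam}(A_i)^2)/2}$.

```python
import math


def embed_ultrametric(points, d):
    """Map each point to a coordinate list so that Euclidean distances equal d
    and every image vector has norm diam(points) / sqrt(2)."""
    if len(points) == 1:
        return {points[0]: []}                 # zero-dimensional space, norm 0
    diam = max(d[x][y] for x in points for y in points)
    remaining, classes = list(points), []
    while remaining:                            # classes of x ~ y iff d[x][y] < diam
        x = remaining[0]
        classes.append([y for y in remaining if d[x][y] < diam])
        remaining = [y for y in remaining if d[x][y] >= diam]
    sub = [embed_ultrametric(c, d) for c in classes]
    dims = [len(e[c[0]]) for e, c in zip(sub, classes)]
    total = sum(dims) + len(classes)            # direct sum plus one axis per class
    image, offset = {}, 0
    for i, (cls, e) in enumerate(zip(classes, sub)):
        cls_diam = max(d[x][y] for x in cls for y in cls)
        extra = math.sqrt((diam ** 2 - cls_diam ** 2) / 2)
        for x in cls:
            v = [0.0] * total
            v[offset:offset + dims[i]] = e[x]   # copy f_i(x) into the i-th block
            v[sum(dims) + i] = extra            # coefficient of e_i
            image[x] = v
        offset += dims[i]
    return image


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def example_metric():
    """Four points: d(a,b) = 1, d(c,e) = 2, all other distances equal 3."""
    points = ["a", "b", "c", "e"]
    close = {frozenset("ab"): 1, frozenset("ce"): 2}
    d = {x: {y: 0 if x == y else close.get(frozenset((x, y)), 3) for y in points}
         for x in points}
    return points, d


if __name__ == "__main__":
    points, d = example_metric()
    image = embed_ultrametric(points, d)
    for x in points:
        print(x, [round(c, 4) for c in image[x]])
```

Vectors coming from different classes have disjoint supports, which is the orthogonality of subtrees mentioned in the text.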



8.2. Ultrametric spaces are ubiquitous. The following theorem is
equivalent to the main result of [136], the original formulation of which
will not be stated here; the formulation below is due to [134].

Theorem 8.1 (Ultrametric skeleton theorem). For every $\varepsilon \in (0,1)$
there exists $C_\varepsilon \in [1, \infty)$ with the following property. Let $(\mathcal{M}, d_{\mathcal{M}})$ be a
compact metric space and let $\mu$ be a Borel probability measure on $\mathcal{M}$.
Then there exists a compact subset $S \subseteq \mathcal{M}$ and a Borel probability
measure $\nu$ that is supported on $S$, such that $(S, d_{\mathcal{M}})$ embeds into an
ultrametric space with distortion at most $9/\varepsilon$ and

$$\forall (x,r) \in \mathcal{M} \times [0, \infty), \qquad \nu(B(x,r) \cap S) \le \big(\mu(B(x, C_\varepsilon r))\big)^{1-\varepsilon}. \qquad (53)$$



The subset $S \subseteq \mathcal{M}$ of Theorem 8.1 is called an ultrametric skeleton
of $\mathcal{M}$ since, as we shall see below and is explained further in [134], it
must be "large" and "spread out", and, more importantly, its main use
is to deduce global information about the initial metric space $(\mathcal{M}, d_{\mathcal{M}})$.

By Theorem 8.1 we know that despite the fact that ultrametric
spaces have a very restricted structure, every metric measure space has
an ultrametric skeleton. We will now describe several consequences
of this fact. Additional examples of consequences of Theorem 8.1 are
contained in Sections 9.1, 9.2, 9.3 below.

Our first order of business is to relate Theorem 8.1 to the above
nonlinear Dvoretzky problems. Theorem 8.1 was discovered in the
context of investigations on nonlinear Dvoretzky theory, and as such it
constitutes another example of a metric space phenomenon that was
uncovered due to the Ribe program.

Theorem 8.2. For every $\varepsilon \in (0,1)$ and $n \in \mathbb{N}$, any $n$-point metric
space has a subset of size at least $n^{1-\varepsilon}$ that embeds into an ultrametric
space with distortion $O(1/\varepsilon)$.

Proof. This is a simple corollary of the ultrametric skeleton theorem,
which does not use its full force. Specifically, right now we will only
care about the case $r = 0$ in (53), though later we will need (53) in its
entirety. So, let $(\mathcal{M}, d_{\mathcal{M}})$ be an $n$-point metric space and let $\mu$ be the
uniform probability measure on $\mathcal{M}$. An application of Theorem 8.1
to the metric measure space $(\mathcal{M}, d_{\mathcal{M}}, \mu)$ yields an ultrametric skeleton
$(S, \nu)$. Thus $(S, d_{\mathcal{M}})$ embeds into an ultrametric space with distortion
$O(1/\varepsilon)$. Since $\nu$ is a probability measure that is supported on $S$, there
must exist a point $x \in S$ with $\nu(\{x\}) \ge 1/|S|$. By (53) (with $r = 0$)
we have $\nu(\{x\}) \le \mu(\{x\})^{1-\varepsilon} = 1/n^{1-\varepsilon}$. Thus $|S| \ge n^{1-\varepsilon}$. □



Theorem 8.2 was first proved in [127], as a culmination of the
investigations in [31, EH EOl Eni, 22]. The best known bound for this problem
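is due to [152], where it is shown that if $\varepsilon \in (0,1)$ then any $n$-point
metric space has a subset of size $n^{1-\varepsilon}$ that embeds into an ultrametric
space with distortion at most

$$D(\varepsilon) = \frac{2}{\varepsilon (1-\varepsilon)^{\frac{1-\varepsilon}{\varepsilon}}}. \qquad (54)$$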


Theorem 8.2 belongs to the nonlinear Dvoretzky framework of Bourgain,
Figiel and Milman because we have seen that ultrametric spaces
admit an isometric embedding into Hilbert space. Moreover, the
following matching impossibility result was proved in [22].

Theorem 8.3. There exist universal constants $K, \kappa \in (0, \infty)$ and for
every $n \in \mathbb{N}$ there exists an $n$-point metric space $(\mathcal{M}_n, d_{\mathcal{M}_n})$ such that
for every $\varepsilon \in (0,1)$ we have

$$\forall S \subseteq \mathcal{M}_n, \qquad |S| \ge K n^{1-\varepsilon} \implies c_2(S, d_{\mathcal{M}_n}) \ge \frac{\kappa}{\varepsilon}.$$



In addition to showing that Theorem 8.2 (and hence also Theorem 8.1)
is asymptotically sharp, Theorem 8.3 establishes that, in general,
the best way (up to constant factors) to find a large approximately
Euclidean subset is to actually find a subset satisfying the more
stringent requirement of being almost ultrametric.

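Turning to the Hausdorff dimensional nonlinear Dvoretzky problem,
we have the following consequence of the ultrametric skeleton theorem
due to [136].

Theorem 8.4. For every $\varepsilon \in (0,1)$ and $\alpha \in (0, \infty)$, any compact
metric space of Hausdorff dimension greater than $\alpha$ has a closed
subset of Hausdorff dimension greater than $(1-\varepsilon)\alpha$ that embeds into an
ultrametric space with distortion $O(1/\varepsilon)$.

Proof. Let $(\mathcal{M}, d_{\mathcal{M}})$ be a compact metric space with $\dim_H(\mathcal{M}) > \alpha$.
By the Frostman lemma (see [80, 119]) it follows that there exists a
Borel probability measure $\mu$ on $\mathcal{M}$ and $K \in (0, \infty)$ such that

$$\forall (x,r) \in \mathcal{M} \times [0, \infty), \qquad \mu(B(x,r)) \le K r^{\alpha}. \qquad (55)$$

An application of Theorem 8.1 to the metric measure space $(\mathcal{M}, d_{\mathcal{M}}, \mu)$
yields an ultrametric skeleton $(S, \nu)$. If $\{B(x_i, r_i)\}_{i=1}^{\infty}$ is a collection of
balls that covers $S$ then

$$1 = \nu(S) = \nu\Big(\bigcup_{i=1}^{\infty} B(x_i, r_i)\Big) \le \sum_{i=1}^{\infty} \nu(B(x_i, r_i)) \le \sum_{i=1}^{\infty} \mu(B(x_i, C_\varepsilon r_i))^{1-\varepsilon} \le K^{1-\varepsilon} C_\varepsilon^{(1-\varepsilon)\alpha} \sum_{i=1}^{\infty} r_i^{(1-\varepsilon)\alpha},$$

where the second inequality uses (53) and the third uses (55).
Having obtained an absolute positive lower bound on $\sum_{i=1}^{\infty} r_i^{(1-\varepsilon)\alpha}$ over
all the covers of $S$ by balls $\{B(x_i, r_i)\}_{i=1}^{\infty}$, we conclude the desired
dimension lower bound $\dim_H(S) \ge (1-\varepsilon)\alpha$. □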

Remark 8.1. It is also proved in [136] that there is a universal constant
$\kappa \in (0, \infty)$ such that for every $\alpha > 0$ and $\varepsilon \in (0,1)$ there exists a
compact metric space $(\mathcal{M}, d_{\mathcal{M}})$ with $\dim_H(\mathcal{M}) = \alpha$ such that

$$\forall S \subseteq \mathcal{M}, \qquad \dim_H(S) \ge (1-\varepsilon)\alpha \implies c_2(S, d_{\mathcal{M}}) \ge \frac{\kappa}{\varepsilon}.$$

Therefore, as in the case of the nonlinear Dvoretzky problem for
finite metric spaces, the question of finding in a general metric space a
high-dimensional subset which is approximately Euclidean is the same
(up to constants) as the question of finding a high-dimensional subset
which is approximately an ultrametric space. This phenomenon helps
explain how investigations that originated in Dvoretzky's theorem led
to a theorem such as Theorem 8.1, whose conclusion seems to be far from its
initial Banach space motivation: the Ribe program indicated a natural
question to ask, but the answer itself turned out to be a truly nonlinear
phenomenon involving subsets which are approximately ultrametric
spaces; a (perhaps unexpected) additional feature that is more useful
than just the extraction of approximately Euclidean subsets.



Remark 8.2. As mentioned above, the best known distortion bound in
Theorem 8.2 is given in (54). When $\varepsilon \to 1$ this bound tends to 2 from
above. Distortion 2 is indeed a barrier here: the nonlinear Dvoretzky
problem exhibits a phase transition at distortion 2 between power-type
and logarithmic behavior of the largest Euclidean subset that can be
extracted in general metric spaces of cardinality $n$. This phenomenon was
discovered in [22] ; see also [211 ESI IM] for related threshold phenomena. 
In their original paper [31] that introduced the nonlinear Dvoretzky
problem, Bourgain, Figiel and Milman proved that for every $D > 1$ any
$n$-point metric space has a subset of size at least $c(D) \log n$ that
embeds with distortion $D$ into Hilbert space. They also proved that there
exist constants $D_0 = 1.023\ldots$, $\kappa \in (0, \infty)$ and for every $n \in \mathbb{N}$ there
exists an $n$-point metric space $(\mathcal{M}_n, d_{\mathcal{M}_n})$ such that every $S \subseteq \mathcal{M}_n$
with $|S| \ge \kappa \log n$ satisfies $c_2(S, d_{\mathcal{M}_n}) \ge D_0$. In [22] this impossibility
result was extended to any distortion in $(1, 2)$, thus establishing the
above phase transition phenomenon. The asymptotic behavior of the
nonlinear Dvoretzky problem at distortion $D = 2$ remains unknown.
For the Hausdorff dimensional version of this question the phase
transition at distortion 2 becomes more extreme: for every $\delta \in (0, 1/2)$ one
can obtain [136] a version of Theorem 8.4 with the resulting subset $S$
having ultrametric distortion $2 + \delta$ and $\dim_H(S) \gtrsim \frac{\delta}{\log(1/\delta)}\,\alpha$. In
contrast, for every $\alpha \in (0, \infty)$ there exists [136] a compact metric space
$(\mathcal{M}, d_{\mathcal{M}})$ of Hausdorff dimension $\alpha$ such that if $S \subseteq \mathcal{M}$ embeds into
Hilbert space with distortion strictly smaller than 2 then $\dim_H(S) = 0$.

9. Examples of applications



Several applications of the Ribe program have already been discussed 
throughout this article. In this section we describe some additional ap- 
plications of this type. We purposefully chose examples of applications 
to areas which are far from Banach space theory, as an indication of 
the relevance of the Ribe program to a variety of fields. 

9.1. Majorizing measures. A (centered) Gaussian process is a family
of random variables $\{G_x\}_{x\in X}$, where $X$ is an abstract index set
and for every $x_1, \ldots, x_n \in X$ and $s_1, \ldots, s_n \in \mathbb{R}$ the random
variable $\sum_{i=1}^n s_i G_{x_i}$ is a mean zero Gaussian random variable. To avoid
technicalities that will obscure the key geometric ideas we will assume
throughout the ensuing discussion that $X$ is finite.

Given a centered Gaussian process $\{G_x\}_{x\in X}$, it is of great interest
to compute (or estimate up to constants) the quantity $\mathbb{E}[\max_{x\in X} G_x]$.
The process induces the metric $d(x,y) = \sqrt{\mathbb{E}[(G_x - G_y)^2]}$ on $X$, and
this metric determines $\mathbb{E}[\max_{x\in X} G_x]$. Indeed, if $X = \{x_1, \ldots, x_n\}$
then consider the $n$ by $n$ matrix $D = (d(x_i, x_j)^2)$ and observe that $D$ is
negative semidefinite on the subspace $\{x \in \mathbb{R}^n : \sum_{i=1}^n x_i = 0\}$ of $\mathbb{R}^n$.
Then,



E 



max Gxi 

i£{l,...,n} 



max 



(y^x 



More importantly, E [max^gx Gx] is well-behaved under bi-Lipschitz 
deformations of {X,d): by the classical Slepian lemma (see e.g. [HTf 
1178] ). if {Gx}xex and {Hx}xex are Gaussian processes satisfying 



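$$\alpha \sqrt{\mathbb{E}\left[(G_x - G_y)^2\right]} \le \sqrt{\mathbb{E}\left[(H_x - H_y)^2\right]} \le \beta \sqrt{\mathbb{E}\left[(G_x - G_y)^2\right]}$$

for all $x, y \in X$, then

$$\alpha\, \mathbb{E}\Big[\max_{x\in X} G_x\Big] \le \mathbb{E}\Big[\max_{x\in X} H_x\Big] \le \beta\, \mathbb{E}\Big[\max_{x\in X} G_x\Big].$$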

These facts suggest that one could "read" the value of $\mathbb{E}[\max_{x\in X} G_x]$
(up to universal constant factors) from the geometry of the metric space
$(X, d)$. How to do this explicitly has been a long standing mystery
until Talagrand proved [178] in 1987 his celebrated majorizing measure
theorem, which solved this question and, based on his investigations
over the ensuing two decades, led to a systematic geometric method
to estimate $\mathbb{E}[\max_{x\in X} G_x]$, with many important applications (see the
books [103, 180, 181] and the references therein). We will now explain
the majorizing measure theorem itself, and how it is a consequence of
the ultrametric skeleton theorem; this deduction is due to [134].

For a finite metric space $(X, d)$ let $\mathrm{Prob}(X)$ denote the space of all
probability measures on $X$. Consider the quantity

$$\gamma_2(X, d) = \inf_{\mu \in \mathrm{Prob}(X)} \sup_{x\in X} \int_0^{\infty} \sqrt{\log\left(\frac{1}{\mu(B(x,r))}\right)}\, dr.$$

The parameter $\gamma_2(X, d)$ should be viewed as a Gaussian version of
a covering number. Indeed, the integral $\int_0^{\infty} \sqrt{\log(1/\mu(B(x,r)))}\, dr$
is large if $\mu$ has a small amount of mass near $x$, so $\gamma_2(X, d)$
measures the extent to which one can spread unit mass over $X$ so that
all the points are "close" to this mass distribution in the sense that
$\max_{x\in X} \int_0^{\infty} \sqrt{\log(1/\mu(B(x,r)))}\, dr$ is as small as possible.
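For a finite metric space and a fixed measure, the integral appearing in this definition is a finite sum, because $r \mapsto \mu(B(x,r))$ is a step function that jumps only at the distances from $x$. The following minimal sketch evaluates it exactly; it assumes closed balls $B(x,r) = \{y : d(x,y) \le r\}$, a metric given as a nested dictionary, and a measure given as a dictionary of weights summing to 1. The names `ball_integral` and `functional_range` are ours. Note that the code does not compute $\gamma_2(X,d)$ itself, which would require optimizing over $\mu$: the maximum over $x$ for a given $\mu$ is only an upper bound on $\gamma_2(X,d)$.

```python
import math


def ball_integral(x, points, d, mu):
    """Integral over r in [0, infinity) of sqrt(log(1 / mu(B(x, r)))),
    for closed balls in a finite metric space."""
    radii = sorted(set(d[x][y] for y in points))      # includes 0 = d(x, x)
    total = 0.0
    for lo, hi in zip(radii, radii[1:]):
        # On [lo, hi) the ball B(x, r) is constant and equals B(x, lo).
        mass = sum(mu[y] for y in points if d[x][y] <= lo)
        if mass <= 0.0:
            return math.inf                           # no mass near x
        total += (hi - lo) * math.sqrt(max(0.0, math.log(1.0 / mass)))
    # Beyond the largest distance the ball is all of X, so the integrand is 0.
    return total


def functional_range(points, d, mu):
    """Return (max over x, min over x) of ball_integral for the measure mu."""
    values = [ball_integral(x, points, d, mu) for x in points]
    return max(values), min(values)


if __name__ == "__main__":
    pts = ["u", "v"]
    dist = {"u": {"u": 0.0, "v": 1.0}, "v": {"u": 1.0, "v": 0.0}}
    uniform = {"u": 0.5, "v": 0.5}
    print(functional_range(pts, dist, uniform))       # both equal sqrt(log 2)
```

For the same measure, the minimum over $x$ is a lower bound on the packing-type quantity obtained by exchanging the roles of the supremum and the infimum.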

Fernique introduced $\gamma_2(X, d)$ in [57], where he proved that every
Gaussian process $\{G_x\}_{x\in X}$ satisfies $\mathbb{E}[\sup_{x\in X} G_x] \lesssim \gamma_2(X, d)$. Under
additional assumptions, he also obtained a matching lower bound
$\mathbb{E}[\sup_{x\in X} G_x] \gtrsim \gamma_2(X, d)$. Notably, Fernique proved in 1975 (see [55]
and also [59, Thm. 1.2]) that if the metric $d(x,y) = \sqrt{\mathbb{E}[(G_x - G_y)^2]}$
happens to be an ultrametric then $\mathbb{E}[\sup_{x\in X} G_x] \asymp \gamma_2(X, d)$. By the
Slepian lemma, the same conclusion holds true also if $(X, d)$ embeds
with distortion $O(1)$ into an ultrametric space.

It is simple to see how the ultrametric structure is relevant to such
probabilistic considerations: in Section 8.1 we explained that an
ultrametric space can be represented as a subset of Hilbert space
corresponding to leaves of a tree in which the subtrees rooted at a given
vertex are mutually orthogonal. In the setting of Gaussian processes
orthogonality is equivalent to (stochastic) independence, so the
geometric assumption of ultrametricity in fact has strong probabilistic
ramifications. Specifically, the problem reduces to the estimation of
the expected supremum of the following special type of Gaussian
process, indexed by leaves of a graph theoretical tree $T = (V, E)$: to each
edge $e \in E(T)$ we associate a mean zero Gaussian random variable
$H_e$, the variables $\{H_e\}_{e\in E(T)}$ are independent, and for every leaf $x$ we have
$G_x = \sum_{e\in E(P_x)} H_e$, where $P_x$ is the unique path joining $x$ and the root
of $T$. This additional independence that the ultrametric structure
provides allowed Fernique to directly prove that $\mathbb{E}[\sup_{x\in X} G_x] \gtrsim \gamma_2(X, d)$.

Due in part to the above evidence, Fernique conjectured in 1974
that $\mathbb{E}[\sup_{x\in X} G_x] \asymp \gamma_2(X, d)$ for every Gaussian process $\{G_x\}_{x\in X}$.
Talagrand's majorizing measure theorem [178] is the positive
resolution of this conjecture. By Fernique's work as described above,
this amounts to the assertion that $\mathbb{E}[\sup_{x\in X} G_x] \gtrsim \gamma_2(X, d)$ for
every Gaussian process $\{G_x\}_{x\in X}$. Talagrand's strategy was to show that
there is $S \subseteq X$ that embeds into an ultrametric space with
distortion $O(1)$, and $\gamma_2(S, d) \gtrsim \gamma_2(X, d)$. It would then follow from
Fernique's original proof of the majorizing measure theorem for
ultrametric spaces that $\mathbb{E}[\sup_{x\in S} G_x] \gtrsim \gamma_2(S, d) \gtrsim \gamma_2(X, d)$. Since trivially
$\mathbb{E}[\sup_{x\in X} G_x] \ge \mathbb{E}[\sup_{x\in S} G_x]$, this strategy will indeed prove the
majorizing measures theorem.

Consider the following quantity

$$\delta_2(X, d) = \sup_{\mu \in \mathrm{Prob}(X)} \inf_{x\in X} \int_0^{\infty} \sqrt{\log\left(\frac{1}{\mu(B(x,r))}\right)}\, dr.$$

For the same reason that $\gamma_2(X, d)$ is in essence a Gaussian covering
number, $\delta_2(X, d)$ should be viewed as a Gaussian version of a packing
number. A short argument (see [134]) shows that $\delta_2(X, d) \asymp \gamma_2(X, d)$
for every finite metric space $(X, d)$.

Take $\mu \in \mathrm{Prob}(X)$ at which $\delta_2(X, d)$ is attained, i.e., for every $x \in X$
we have $\int_0^{\infty} \sqrt{\log(1/\mu(B(x,r)))}\, dr \ge \delta_2(X, d)$. An application of the
ultrametric skeleton theorem to the metric measure space $(X, d, \mu)$
with, say, $\varepsilon = 3/4$, yields an ultrametric skeleton $(S, \nu)$. Thus $S \subseteq X$
embeds into an ultrametric space with distortion $O(1)$ and $\nu \in \mathrm{Prob}(S)$
satisfies $\nu(B(x,r)) \le \mu(B(x, Cr))^{1/4}$ for all $x \in X$ and $r > 0$, where
$C > 0$ is a universal constant. It follows that for every $x \in S$ the
integral $\int_0^{\infty} \sqrt{\log(1/\nu(B(x,r)))}\, dr$ is at least $\frac{1}{2} \int_0^{\infty} \sqrt{\log(1/\mu(B(x,Cr)))}\, dr$,
which by a change of variable equals $\frac{1}{2C} \int_0^{\infty} \sqrt{\log(1/\mu(B(x,r)))}\, dr$. But
$\int_0^{\infty} \sqrt{\log(1/\mu(B(x,r)))}\, dr \ge \delta_2(X, d)$ by our choice of $\mu$. By the
definition of $\delta_2(S, d)$ we have $\delta_2(S, d) \ge \inf_{x\in S} \int_0^{\infty} \sqrt{\log(1/\nu(B(x,r)))}\, dr$, so
$\delta_2(S, d) \gtrsim \delta_2(X, d)$. Since $\delta_2(\cdot) \asymp \gamma_2(\cdot)$, the proof is complete.

Remark 9.1. The use of ultrametric constructions in metric spaces in 
order to prove maximal inequalities is a powerful paradigm in analysis. 
The original work of Fernique and Talagrand on majorizing measures 
is a prime example of the success of such an approach, and methods 
related to (parts of the proof of) the ultrametric skeleton theorem have 
been used in the context of certain maximal inequalities in |13H |15 1 j . 
Other notable examples of related ideas include [131 IIHI I153[ [SS] . 

9.2. Lipschitz maps onto cubes. Keleti, Mathe and Zindulka [98] 
proved the following theorem using the nonlinear Dvoretzky theorem 



for Hausdorff dimension (Theorem 8.4), thus answering a question of
Urbański [TH1|.

Theorem 9.1. Fix $n \in \mathbb{N}$ and let $(\mathcal{M}, d_{\mathcal{M}})$ be a compact metric space
of Hausdorff dimension bigger than $n$. Then there exists a Lipschitz
mapping from $\mathcal{M}$ onto the cube $[0, 1]^n$.



If, in addition to the assumptions of Theorem 9.1, $(\mathcal{M}, d_{\mathcal{M}})$ is an
ultrametric space, then Theorem 9.1 is proved as follows. By Frostman's
lemma there exists a Borel probability measure $\mu$ on $\mathcal{M}$ and $K \in (0, \infty)$
such that $\mu(A) \le K (\mathrm{diam}(A))^n$ for all Borel $A \subseteq \mathcal{M}$. Moreover, as
explained in Section 8.1, there exists a linear order $\prec$ on $\mathcal{M}$
satisfying (52). Define $\varphi : \mathcal{M} \to [0, 1]$ by $\varphi(x) = \mu(\{y \in \mathcal{M} : y \prec x\})$. Then
$|\varphi(x) - \varphi(y)| \le K d_{\mathcal{M}}(x,y)^n$ for all $x, y \in \mathcal{M}$. Thus $\varphi$ is continuous, and
since $\mu$ is atom-free and $\mathcal{M}$ is compact, it follows that $\varphi(\mathcal{M}) = [0, 1]$.
Letting $P$ be a $1/n$-Hölder Peano curve from $[0, 1]$ onto $[0, 1]^n$ (see
e.g. [173]), the mapping $f = P \circ \varphi$ has the desired properties.



To prove Theorem 9.1, start with a general compact metric space
$(\mathcal{M}, d_{\mathcal{M}})$ with $\dim_H(\mathcal{M}) > n$. By Theorem 8.4 there exists a compact
subset $S \subseteq \mathcal{M}$ with $\dim_H(S) > n$ that admits a bi-Lipschitz
embedding into an ultrametric space. By the above reasoning there exists a
Lipschitz mapping $f$ from $S$ onto $[0, 1]^n$. We now conclude the proof
of Theorem 9.1 by extending $f$ to a Lipschitz mapping $F : \mathcal{M} \to [0, 1]^n$
(e.g. via the nonlinear Hahn-Banach theorem [2S1 Lem. 1.1]). 

The above reasoning exemplifies the role of ultrametric skeletons: $S$
was used as a tool, but the conclusion makes no mention of ultrametric
spaces. Moreover, $S$ itself admits an $n$-Hölder mapping onto $[0, 1]$,
something which is impossible to do for general $\mathcal{M}$. Only after
composition with a Peano curve do we get a Lipschitz mapping to which
the nonlinear Hahn-Banach theorem applies, allowing us to deduce a
theorem about $\mathcal{M}$ with no mention of the ultrametric skeleton $S$.



9.3. Approximate distance oracles and approximate ranking. 

Here we explain applications of nonlinear Dvoretzky theory to com- 
puter science. By choosing to discuss only a couple examples we are 
doing an injustice to the impact that the Ribe program has had on 
theoretical computer science. We refer to [ml ^ Eni [IIHl ESI fTS9] 
for a more thorough (but still partial) description of the role of ideas 
that are motivated by the Ribe program in approximation algorithms. 
Even if we only focus attention on nonlinear Dvoretzky theorems, the 
full picture is omitted below: Theorem 8.2 also yields the best known
lower bound [20, 22] on the competitive ratio of the randomized $k$-server
problem; a central question in the field of online algorithms.

An $n$-point metric space $(X, d_X)$ is completely determined by the
numbers $\{d_X(x,y)\}_{x,y\in X}$. One can therefore store $\binom{n}{2}$ numbers, so that
when one is asked the distance between two points $x, y \in X$ it is
possible to output the number $d_X(x,y)$ in constant time. The approximate
distance oracle problem asks for a way to store $o(n^2)$ numbers so that
given (a distance query) $x, y \in X$ one can quickly output a number
that is guaranteed to be within a prescribed factor of the true distance
$d_X(x,y)$. The following theorem was proved in [127] as a consequence
of the nonlinear Dvoretzky theorem (Theorem 8.2).

Theorem 9.2. Fix $D > 1$. Every $n$-point metric space $(\{1, \ldots, n\}, d)$
can be preprocessed in time $O(n^2)$ to yield a data structure of size
$O(n^{1+O(1/D)})$ so that given $i, j \in \{1, \ldots, n\}$ one can output in $O(1)$
time a number $E(i,j)$ that is guaranteed to satisfy

$$d(i,j) \le E(i,j) \le D\, d(i,j). \qquad (56)$$
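As a baseline for comparison with Theorem 9.2, the trivial exact oracle that stores all $\binom{n}{2}$ distances can be sketched as follows, together with a brute-force check of the guarantee (56) for an arbitrary query function. This is only an illustration of the problem statement, not of the construction behind Theorem 9.2; the class `ExactDistanceTable`, the function `satisfies_stretch`, and the use of indices $0, \ldots, n-1$ instead of $1, \ldots, n$ are our choices.

```python
class ExactDistanceTable:
    """Store the n*(n-1)/2 pairwise distances in a flat list and answer
    distance queries with a constant number of arithmetic operations."""

    def __init__(self, n, dist):
        self.n = n
        # Row-major order of the strict upper triangle: (0,1), (0,2), ..., (n-2,n-1).
        self.table = [dist(i, j) for i in range(n) for j in range(i + 1, n)]

    def _index(self, i, j):
        # Rows 0..i-1 contribute i*n - i*(i+1)/2 entries; then skip to column j.
        return i * self.n - i * (i + 1) // 2 + (j - i - 1)

    def query(self, i, j):
        if i == j:
            return 0.0
        if i > j:
            i, j = j, i
        return self.table[self._index(i, j)]


def satisfies_stretch(estimate, dist, n, stretch):
    """Check d(i,j) <= E(i,j) <= stretch * d(i,j) for all pairs i < j."""
    return all(
        dist(i, j) <= estimate(i, j) <= stretch * dist(i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )


if __name__ == "__main__":
    n = 6
    line_metric = lambda i, j: abs(i - j)
    oracle = ExactDistanceTable(n, line_metric)
    print(len(oracle.table), satisfies_stretch(oracle.query, line_metric, n, 1.0))
```

The point of Theorem 9.2 is that one can keep the constant query time of this table while storing far fewer than $\binom{n}{2}$ numbers, at the price of the stretch factor $D$.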

Here, and in what follows, all the implied constants in the $O(\cdot)$
notation are universal constants. The preprocessing time of Theorem 9.2
is due to Mendel and Schwob [137], improving over the original
preprocessing time of $O(n^{2+O(1/D)})$ that was obtained in [127].

In their important paper [182], Thorup and Zwick constructed
approximate distance oracles as in Theorem 9.2, but with query time
$O(D)$. Their preprocessing time is $O(n^2)$, and the size of their data
structure is $O(D n^{1+2(1+O(1/D))/D})$. The key feature of Theorem 9.2 is that it
yields constant query time, i.e., a true oracle. In addition, the proof of
Theorem 9.2 is via a new geometric method that we will sketch below,
based on nonlinear Dvoretzky theory.

Note that the exponent of $n$ in the size of the Thorup-Zwick oracle
is at most $1 + 2(1 + o(1))/D$, while in Theorem 9.2 it is $1 + C/D$
for some universal constant $C$ (which can be shown to be at most
20). This difference in constants can be important for applications,
but recently Wulff-Nilsen proved [191] that one can use the oracle of
Theorem 9.2 as a black box (irrespective of the constant $C$) to construct
an oracle of size $O(n^{1+2(1+\varepsilon)/D})$ whose query time depends only on $\varepsilon$.
The significance of the constant 2 here is that [182] establishes that it
is sharp conditioned on the validity of a positive solution to a certain
well-known combinatorial open question of Erdős [53].

For the sake of the discussion in this survey one should think of "time" as
the number of locations in the data structure that are probed plus the number of
arithmetic operations that are performed. "Size" refers to the number of floating
point numbers that are stored. The computational model in which we will be
working is the RAM model, although weaker computational models such as the "Unit
cost floating-point word RAM model" will suffice. See [73, 127] for a discussion of
these computational issues. The preprocessing algorithms below are randomized,
in which case "preprocessing time" refers to "expected preprocessing time". All
other algorithms are deterministic.

Sommer, Verbin and Yu [177] have shown that Theorem 9.2 is sharp
in the sense of the following lower bound in the cell-probe model. Any
data structure that, given a query $i, j \in \{1, \ldots, n\}$, outputs in time $t$ a
number $E(i,j)$ satisfying (56) must have size at least $n^{1+\Omega(1/(tD))}/\log n$.
This lower bound works even when the oracle's performance is
measured only on metric spaces corresponding to sparse graphs. The fact
that the query time $t$ of Theorem 9.2 is a universal constant thus makes
this theorem asymptotically sharp. Nonlinear Dvoretzky theory is the
only currently known method that yields such sharp results.



It turns out that the proof of Theorem 8.2 in [127] furnishes a randomized
polynomial time algorithm that, given an $n$-point metric space
$(X,d_X)$, outputs a subset $S \subseteq X$ with $|S| \ge n^{1-\varepsilon}$ such that $(S,d_X)$
embeds into an ultrametric space with distortion $O(1/\varepsilon)$. Moreover,
we can ensure that there exists an ultrametric $\rho$ on $X$ such that for
every $x \in X$ and $s \in S$ we have $d_X(x,s) \le \rho(x,s) \le \frac{c}{\varepsilon} d_X(x,s)$,
where $c \in (0,\infty)$ is a universal constant. The latter statement follows
from the following general ultrametric extension lemma [127], though
the proof of Theorem 8.2 in [127] actually establishes this fact directly
without invoking Lemma 9.3 below (this is important if one cares about
constant factors).

Lemma 9.3 (Extension lemma for approximate ultrametrics). Let
$(X,d_X)$ be a finite metric space and fix $S \subseteq X$ and $D \ge 1$. Suppose
that $\rho_0 : S \times S \to [0,\infty)$ is an ultrametric on $S$ satisfying
$d_X(x,y) \le \rho_0(x,y) \le D d_X(x,y)$ for all $x,y \in S$. Then there exists an
ultrametric $\rho : X \times X \to [0,\infty)$ such that $\rho(x,y) = \rho_0(x,y)$ if $x,y \in S$,
for every $x,y \in X$ we have $\rho(x,y) \ge d_X(x,y)/3$, and for every $x \in X$
and $y \in S$ we have $\rho(x,y) \le 2D d_X(x,y)$.



We are now in position to apply Theorem 8.2 iteratively as follows.
Set $S_0 = \emptyset$ and let $S_1 \subseteq X$ be the subset whose existence is stipulated
in Theorem 8.2. Thus there exists an ultrametric $\rho_1$ on $X$ satisfying
$d_X(x,y) \le \rho_1(x,y) \le \frac{c}{\varepsilon} d_X(x,y)$ for all $x \in X$ and $y \in S_1$. Apply the
same procedure to $X \setminus S_1$, and continue inductively until the entire
space $X$ is exhausted. We obtain a partition $\{S_1,\ldots,S_m\}$ of $X$ with
the following properties holding for every $k \in \{1,\ldots,m\}$.

There exists an ultrametric $\rho_k$ on $X \setminus \bigcup_{j=0}^{k-1} S_j$ satisfying

$$d_X(x,y) \le \rho_k(x,y) \le \frac{c}{\varepsilon}\, d_X(x,y)$$

for all $x \in X \setminus \bigcup_{j=0}^{k-1} S_j$ and $y \in S_k$.

See [142] for more information on the cell probe computational model. It suffices
to say here that it is a weak model, so cell probe lower bounds should be viewed as
strong impossibility results.



As we have seen in Section 8.1, for every $k \in \{1,\ldots,m\}$ the ultrametric
$\rho_k$ corresponds to a combinatorial tree whose leaves are $X \setminus \bigcup_{j=0}^{k-1} S_j$
and each vertex of which is labeled by a nonnegative number such
that for $x,y \in X \setminus \bigcup_{j=0}^{k-1} S_j$ the label of their least common ancestor
is exactly $\rho_k(x,y)$. A classical theorem of Harel and Tarjan [76]
(see also [25]) states that any $N$-vertex tree can be preprocessed in
time $O(N)$ so as to yield a data structure of size $O(N)$ which, given
two nodes as a query, returns their least common ancestor in time
$O(1)$. By applying the Harel-Tarjan data structure to each of the
trees corresponding to $\rho_k$ we obtain an array of data structures (see
Figure 4) that can answer distance queries as follows. Given distinct
$x,y \in X$ let $k \in \{1,\ldots,m\}$ be the minimal index for which
$\{x,y\} \cap S_k \neq \emptyset$. Thus $x,y \in X \setminus \bigcup_{j=0}^{k-1} S_j$, and, using the Harel-Tarjan
data structure corresponding to $\rho_k$, output in $O(1)$ time the
label of the least common ancestor of $x,y$ in the tree corresponding to
$\rho_k$. This output equals $\rho_k(x,y)$, which, since $\{x,y\} \cap S_k \neq \emptyset$, satisfies
$d_X(x,y) \le \rho_k(x,y) \le \frac{c}{\varepsilon} d_X(x,y)$. Setting $D = c/\varepsilon$ and analyzing the
size of the data structure thus obtained (using the recursion for the
cardinality of $S_k$), yields Theorem 9.2; the details of this computation
can be found in [127].
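The query procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `LabeledTree` and `LayeredOracle` are hypothetical, the trees and the partition are supplied by hand rather than produced by Theorem 8.2, and the least common ancestor is found by a naive walk towards the root instead of the constant-time Harel-Tarjan structure.

```python
class LabeledTree:
    """Rooted tree whose leaves are points; internal nodes carry nonnegative labels."""

    def __init__(self, parent, label):
        self.parent = parent  # node -> parent node (the root maps to None)
        self.label = label    # internal node -> label
        self.depth = {}
        for v in parent:
            self._set_depth(v)

    def _set_depth(self, v):
        if v not in self.depth:
            p = self.parent[v]
            self.depth[v] = 0 if p is None else self._set_depth(p) + 1
        return self.depth[v]

    def lca_label(self, x, y):
        # Naive climb for distinct leaves x, y: O(depth) time, standing in for
        # the O(1) Harel-Tarjan query (an assumption made for brevity).
        while x != y:
            if self.depth[x] >= self.depth[y]:
                x = self.parent[x]
            else:
                y = self.parent[y]
        return self.label[x]


class LayeredOracle:
    def __init__(self, layers):
        # layers[k] = (S_k, tree_k), where the leaves of tree_k are the points
        # not covered by the earlier sets S_1, ..., S_{k-1}.
        self.trees = [tree for _, tree in layers]
        self.layer_of = {x: k for k, (S, _) in enumerate(layers) for x in S}

    def query(self, x, y):
        # Minimal index k for which {x, y} meets S_k; both points are leaves of tree k.
        k = min(self.layer_of[x], self.layer_of[y])
        return self.trees[k].lca_label(x, y)


# Toy data: X = {a, b, c, d}, S_1 = {a, b}, S_2 = {c, d}.
tree1 = LabeledTree(
    parent={"r": None, "u": "r", "v": "r", "a": "u", "b": "u", "c": "v", "d": "v"},
    label={"r": 8.0, "u": 2.0, "v": 3.0},
)
tree2 = LabeledTree(parent={"s": None, "c": "s", "d": "s"}, label={"s": 2.5})
oracle = LayeredOracle([({"a", "b"}, tree1), ({"c", "d"}, tree2)])
```

In this toy instance the query for the pair (a, c) is answered by the first tree, while the pair (c, d) falls through to the second tree, mirroring the choice of the minimal index $k$ above.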

The ideas presented above are used in [127] to solve additional data
structure problems. For example, we have the following theorem that
addresses the approximate ranking problem, in which the goal is to
compress the natural "$n$ proximity orders" (or "rankings") induced on
each of the points in an $n$-point metric space (i.e., each $x \in X$ orders
the points of $X$ by increasing distance from itself).

Theorem 9.4. Fix $D > 1$, $n \in \mathbb{N}$ and an $n$-point metric space $(X,d_X)$.
Then there exists a data structure which can be preprocessed in time
$O(Dn^{2+C/D}\log n)$, has size $O(Dn^{1+C/D})$, and supports the following
type of queries. Given $x \in X$, have "fast access" to a bijection
$\pi^{(x)} : \{1,\ldots,n\} \to X$ satisfying

$$\forall\, 1 \le i < j \le n, \qquad d_X(x,\pi^{(x)}(i)) \le D\, d_X(x,\pi^{(x)}(j)).$$

By "fast access" to $\pi^{(x)}$ we mean that we can do the following in $O(1)$
time:

(1) Given $x \in X$ and $i \in \{1,\ldots,n\}$ output $\pi^{(x)}(i)$.

(2) Given $x,u \in X$ output $j \in \{1,\ldots,n\}$ satisfying $\pi^{(x)}(j) = u$.

Figure 4. In the approximate distance oracle problem an iterative
application of Theorem 8.2 yields an array of trees, which are then
transformed into an array of Harel-Tarjan data structures. For
the approximate ranking problem we also need to extend each tree to a
tree whose leaves are the entire space $X$ using Lemma 9.3. The nodes
that were added to these trees are illustrated by empty circles,
and the dotted lines are their connections to the original tree.



The proof of Theorem 9.4 follows the same procedure as above, with
the following differences: at each stage we extend the ultrametric $\rho_k$
from $X \setminus \bigcup_{j=0}^{k-1} S_j$ to $X$ using Lemma 9.3, and we replace the Harel-Tarjan
data structure by a new data structure that is custom-made for
the approximate ranking problem. The details are contained in [127].

9.4. Random walks and quantitative nonembeddability. While
Ball introduced the notion of Markov type in order to investigate the
Lipschitz extension problem, this notion has proved to be a versatile
tool for the purpose of proving nonembeddability results. The use of
Markov type in the context of embedding problems was introduced
in [112], and this method has been subsequently developed in [22], [149],
[13], [147], [148]. Somewhat curiously, Markov type can also be used as
a tool to prove Lipschitz non-extendability results; see [148]. Markov
type is therefore a good example of the impact of ideas originating in
the Ribe program on metric geometry.

In this section we illustrate how one can use the notion of Markov 
type to reason that certain metric spaces must be significantly dis- 
torted in any embedding into certain Banach spaces. Since our goal 
here is to explain in the simplest possible terms this way of think- 
ing about nonembeddability, we will mostly deal with model problems, 
which might not necessarily be the most general/difficult/important
problems of this type. For example, we will almost always state our
results for embeddings into Hilbert space, though it will be obvious
how to extend our statements to general target spaces with Markov
type $p \in (1,\infty)$. Also, we will present proofs in the case of finite
graphs with large girth. While these geometric objects are somewhat 
exotic, they serve as a suitable model case for other spaces of inter- 
est, to which Markov type techniques also apply (e.g., certain Cayley 
graphs, including the discrete hypercube), since the large girth assump- 
tion simplifies the arguments, while preserving the essential ideas. We 
stress, however, that finite graphs with large girth are interesting geo- 
metric objects in their own right. Their existence is established with 
essentially complete freedom in the choice of certain governing parameters
(such as the girth and degree; see [172]), yet understanding their
geometry is difficult: this is illustrated by the fact that several basic 
problems on the embeddability properties of such graphs remain open. 
We will present some of these open problems later. 

Fix an integer $k \ge 3$. Let $G = (V,E)$ be an $n$-vertex $k$-regular
connected graph, equipped with its associated shortest path metric $d_G$.
Let $g$ be the girth of $G$, i.e., the length of the shortest closed cycle in $G$.
Fix an integer $r < g/4$. For any ball $B$ of radius $r$ in $G$, the metric space
$(B,d_G)$ is isometric to $(T_r^k, d_{T_r^k})$ (the tree $T_r^k$ is defined in Section 5);
see Figure 5. Thus Bourgain's lower bound (48) implies that

$$c_{\ell_2}(G) \gtrsim \sqrt{\log g}. \qquad (57)$$



Can we do better than (57)? It seems reasonable to expect that
we should be able to say more about the geometry of $G$ than that it
contains a large tree. When one tries to imagine what a finite
graph with large girth looks like, one quickly realizes that it must be
a complicated object: while it is true that small enough balls in such
a graph are trees, these local trees must somehow be glued together
to create a finite $k$-regular graph. It seems natural to expect that the
interaction between these local trees induces a geometry which is far
more complicated than what is suggested by the lower bound (57). This
question was raised in 1995 by Linial, London and Rabinovich.
Our ultimate goal is to argue that all large enough subsets of $G$ must
be significantly distorted when embedded into Hilbert space, but as a
warmup we will start with an argument of [112] which shows how the
fact that Hilbert space has Markov type 2 easily implies the following
exponential improvement to (57):



$$c_{\ell_2}(G) \gtrsim \sqrt{g}. \qquad (58)$$

To prove (58) we shall use the fact that $G$ has large girth as follows:
it isn't only the case that $G$ contains large trees, in fact every
small enough ball in $G$ is isometric to a tree. This information can be
harnessed to our advantage as follows. Let $\{Z_t\}_{t=0}^\infty$ be the standard
random walk on $G$, i.e., $Z_0$ is uniformly distributed on $V$ and $Z_{t+1}$
conditioned on $Z_t$ is uniformly distributed on the $k$ neighbors of $Z_t$.
Then $\{Z_t\}_{t=0}^\infty$ is a stationary reversible Markov chain on $V$. We claim
that for every $t < \frac{g}{2} - 1$ we have

$$\mathbb{E}[d_G(Z_t,Z_0)] \gtrsim t. \qquad (59)$$




Figure 5. A 3-regular graph with girth $g$. Balls of
radius $g/2$ look like trees in the sense that the distance
of any vertex to the center of the ball is the same as
the corresponding distance in a 3-regular tree rooted
at the center of the ball. Any ball of radius $g/4$ is
isometric to a 3-regular tree. If we pick the center
of the ball uniformly at random, and then perform a
standard random walk, then up to time $< g/2$, at each
step there is probability $2/3$ to step further away from
the center in the next step.



The proof of (59) is simple. $Z_0$ is chosen uniformly among the vertices
of $G$. But, once $Z_0$ has been chosen, the walk $\{Z_s : s < \frac{g}{2} - 1\}$
is simply the standard walk on a $k$-regular tree starting from its root.
At each step of this walk, if $Z_s \neq Z_0$ then with probability $1 - \frac{1}{k}$ the
vertex $Z_{s+1}$ is one of the $k-1$ neighbors of $Z_s$ which are further away
from $Z_0$ than $Z_s$, and with probability $\frac{1}{k}$ the vertex $Z_{s+1}$ is the unique
neighbor of $Z_s$ that lies on the (unique) path joining $Z_s$ and $Z_0$. If it
happens to be the case that $Z_s = Z_0$, then $Z_{s+1}$ is further away from $Z_0$
than $Z_s$ with probability 1. Since $1 - \frac{1}{k} > \frac{1}{2}$, we see that even though
$\{Z_s : s < \frac{g}{2} - 1\}$ is a stationary reversible Markov chain, in terms of
the distance from $Z_0$ it is effectively a one dimensional random walk
with positive drift, implying the required lower bound (59).
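The drift argument can be made concrete by computing the expected distance from the root exactly. The following Python sketch (the function name `expected_distance_on_tree` is ours, and the infinite $k$-regular tree stands in for the tree-like ball around $Z_0$) tracks the law of the distance process, which moves away with probability $1 - 1/k$ and back with probability $1/k$, except at the root where it always moves away.

```python
from fractions import Fraction


def expected_distance_on_tree(k, t):
    """Exact E[d(Z_t, Z_0)] for the standard walk on the infinite k-regular tree."""
    law = {0: Fraction(1)}  # law of the distance from the root at time 0
    for _ in range(t):
        nxt = {}
        for s, p in law.items():
            if s == 0:
                moves = [(1, Fraction(1))]             # at the root every step moves away
            else:
                moves = [(s + 1, Fraction(k - 1, k)),  # k-1 neighbors are further away
                         (s - 1, Fraction(1, k))]      # one neighbor is closer
            for s2, q in moves:
                nxt[s2] = nxt.get(s2, Fraction(0)) + p * q
        law = nxt
    return sum(s * p for s, p in law.items())
```

Since the one-step drift is at least $(k-2)/k$ at every position, the value returned for $t$ steps is at least $t(k-2)/k$, which is the linear growth asserted in (59) with an explicit constant.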
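Suppose that $f : V \to L_2$ satisfies

$$\forall\, x,y \in V, \qquad d_G(x,y) \le \|f(x) - f(y)\|_2 \le D\, d_G(x,y). \qquad (60)$$

Our goal is to bound $D$ from below. The fact that Hilbert space has
Markov type 2 implies that for all times $t < \frac{g}{2} - 1$ we have

$$t^2 \stackrel{(59)}{\lesssim} \big(\mathbb{E}[d_G(Z_t,Z_0)]\big)^2 \le \mathbb{E}\big[d_G(Z_t,Z_0)^2\big] \stackrel{(60)}{\le} \mathbb{E}\big[\|f(Z_t) - f(Z_0)\|_2^2\big]$$
$$\lesssim t\,\mathbb{E}\big[\|f(Z_1) - f(Z_0)\|_2^2\big] \stackrel{(60)}{\le} tD^2\,\mathbb{E}\big[d_G(Z_1,Z_0)^2\big] = tD^2. \qquad (61)$$

Here the second inequality is the Cauchy-Schwarz inequality, and the
fourth inequality is the Markov type 2 inequality (40). Taking $t \asymp g$
in (61) gives $D^2 \gtrsim g$, which is (58).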



The above argument can be extended to the case when $G$ is not
necessarily a regular graph. All we need is that the average degree of
$G$ is greater than 2. Recall that the average degree of $G$ is

$$\frac{1}{|V|}\sum_{x \in V} \deg_G(x) = \frac{2|E|}{|V|},$$

where $\deg_G(x)$ denotes the number of edges in $E$ emanating from $x$.
Since we will soon be forced to deal with graphs of large girth which are
not necessarily regular, we record here the following lemma from [22]:

Lemma 9.5. Let $G = (V,E)$ be a connected graph with girth $g$ and
average degree $k$. Then

$$c_{\ell_2}(G) \gtrsim \left(1 - \frac{2}{k}\right)\sqrt{g}. \qquad (62)$$



The proof of Lemma 9.5 follows the lines of the above proof of (58)
with the following changes. For $x \in V$ define

$$\pi(x) = \frac{\deg_G(x)}{\sum_{y \in V}\deg_G(y)} = \frac{\deg_G(x)}{2|E|}. \qquad (63)$$

Now, let $\{Z_t\}_{t=0}^\infty$ be the standard random walk on $G$, where $Z_0$ is
distributed on $V$ according to the probability distribution $\pi$. Then $\{Z_t\}_{t=0}^\infty$
is a stationary reversible Markov chain on $V$, so that the Markov type
2 inequality still applies to it. A short computation now yields (62);
the details are contained in Theorem 6.1 of [22].
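The claim that the degree-proportional distribution makes the standard random walk stationary and reversible can be checked mechanically on any small graph. The sketch below uses exact rational arithmetic; the function names and the example graph (a triangle with a pendant path) are our own illustrative choices.

```python
from fractions import Fraction


def degree_distribution(adj):
    """pi(x) = deg(x) / (2|E|) for a graph given as {vertex: set of neighbors}."""
    two_E = sum(len(nbrs) for nbrs in adj.values())
    return {x: Fraction(len(nbrs), two_E) for x, nbrs in adj.items()}


def transition(adj, x, y):
    """One-step probability of the standard random walk from x to y."""
    return Fraction(1, len(adj[x])) if y in adj[x] else Fraction(0)


def is_reversible(adj, pi):
    # Detailed balance: pi(x) P(x, y) = pi(y) P(y, x) for all x, y.
    return all(pi[x] * transition(adj, x, y) == pi[y] * transition(adj, y, x)
               for x in adj for y in adj)


def is_stationary(adj, pi):
    # pi P = pi.
    return all(sum(pi[x] * transition(adj, x, y) for x in adj) == pi[y]
               for y in adj)


# An irregular connected graph: the triangle 0-1-2 with the pendant path 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
pi = degree_distribution(adj)
```

Detailed balance holds here because both sides equal $1/(2|E|)$ whenever $xy$ is an edge, and stationarity follows by summing detailed balance over $x$.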

This type of use of random walks is quite flexible. For example, consider
the case of the Hamming cube $(\{-1,1\}^n, \|\cdot\|_1)$. Let $\{Z_t\}_{t=0}^\infty$ be
the standard random walk on $\{-1,1\}^n$, where $Z_0$ is distributed uniformly
on $\{-1,1\}^n$. At each step, one of the $n$ coordinates of $Z_t$ is
chosen uniformly at random, and its sign is flipped. For $t < \frac{n}{2}$ we have
$\mathbb{E}[\|Z_t - Z_0\|_1] \gtrsim t$, since at each step with probability at least $\frac12$ the
coordinate being flipped has not been flipped in any previous step of the
walk. As we have argued above, this implies that $c_{\ell_2}(\{-1,1\}^n) \gtrsim \sqrt{n}$.
This lower bound is sharp up to the implied multiplicative constant;
in fact, a classical result of Enflo [19] states that $c_{\ell_2}(\{-1,1\}^n) = \sqrt{n}$.
Enflo's proof of this fact uses a tensorization argument (i.e., induction
on dimension while relying on the product structure of the Hamming
cube). Another proof [25] of Enflo's theorem can be deduced
from a Fourier analytic argument (both known proofs of the equality
$c_{\ell_2}(\{-1,1\}^n) = \sqrt{n}$ are nicely explained in the book [118]). These
proofs rely heavily on the structure of the Hamming cube, while, as we
shall see below, the random walk proof that we presented here is more
robust: e.g. it applies to negligibly small subsets of the Hamming cube
which may be highly unstructured.
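For the cube the expected displacement of the walk can be computed exactly, which makes the linear growth for $t < n/2$ visible. The following Python sketch (the function name is ours) tracks the law of the Hamming distance $h$ between $Z_t$ and $Z_0$: a step increases $h$ with probability $(n-h)/n$ and decreases it with probability $h/n$, and each differing coordinate contributes 2 to the $\ell_1$ distance.

```python
from fractions import Fraction


def expected_l1_distance(n, t):
    """Exact E||Z_t - Z_0||_1 for the walk on {-1,1}^n flipping one random coordinate per step."""
    law = {0: Fraction(1)}  # law of the Hamming distance between Z_t and Z_0
    for _ in range(t):
        nxt = {}
        for h, p in law.items():
            if h < n:  # flip a coordinate that currently agrees with Z_0
                nxt[h + 1] = nxt.get(h + 1, Fraction(0)) + p * Fraction(n - h, n)
            if h > 0:  # flip back a coordinate that currently differs from Z_0
                nxt[h - 1] = nxt.get(h - 1, Fraction(0)) + p * Fraction(h, n)
        law = nxt
    # Each differing coordinate contributes |1 - (-1)| = 2 to the l_1 distance.
    return 2 * sum(h * p for h, p in law.items())
```

The recursion $\mathbb{E}[h_{t+1}] = (1 - 2/n)\,\mathbb{E}[h_t] + 1$ gives the closed form $\mathbb{E}\|Z_t - Z_0\|_1 = n\big(1 - (1 - 2/n)^t\big)$, which the function reproduces and which is at least $t$ when $t < n/2$.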

Before passing to a more sophisticated application of Markov type,
we recall the following interesting open question [112].

Question 7. Let $c_2(g)$ be the infimum of $c_{\ell_2}(G)$ over all finite 3-regular
connected graphs $G$ with girth $g$. What is the growth rate of $c_2(g)$ as
$g \to \infty$? In particular, does $c_2(g)$ grow asymptotically faster than $\sqrt{g}$?

In order to prove that $\lim_{g\to\infty} c_2(g)/\sqrt{g} = \infty$ (if true), we would
need to use more about the structure of $G$ than the fact that a ball
of radius $\asymp g$ around each vertex is isometric to a 3-regular tree. One
would need to understand the complicated regime in which these local
trees interact. Our understanding of the geometry of these interactions
is currently quite poor, which is why Question 7 is meaningful. On
the other hand, if for arbitrarily large $g \in \mathbb{N}$ there were 3-regular
graphs $G$ of girth $g$ with $c_{\ell_2}(G) \lesssim \sqrt{g}$, this would also have interesting
consequences, as explained in [112]. Note that one could also ask a
variant of Question 7 when $g$ depends on the cardinality of $V$. The
case $g \asymp \log |V|$ is of particular importance (see [112]).

Letting $c_1(g)$ denote the infimum of $c_{\ell_1}(G)$ over all finite 3-regular
connected graphs $G$ with girth $g$, it was also asked in [112] whether or
not $c_1(g)$ tends to $\infty$ with $g$. This question was recently solved by
Ostrovskii [158], who showed that for arbitrarily large $n \in \mathbb{N}$ there exists
a 3-regular graph $G_n$ of girth at least a constant multiple of $\log\log n$
yet $c_{\ell_1}(G_n) = O(1)$. Since trees admit an isometric embedding into $\ell_1$,
such questions address the issue of how the local geometry of a metric
space affects its global geometry (see [H, 39, 100, 167] for related
investigations along these lines). It remains an interesting open question
whether there exist arbitrarily large graphs of logarithmic girth that
admit a bi-Lipschitz embedding into $\ell_1$; see [112] for ramifications of
this question.

9.4.1. Impossibility results for nonlinear Dvoretzky problems. Our goal 
here is to explain the relevance of Markov type techniques to proving 
impossibility results for nonlinear Dvoretzky problems, i.e., to show 
that certain metric spaces cannot have large subsets that well-embed 
into Hilbert space (or a metric space with nontrivial Markov type). 
Everything presented here is part of the investigation in [22] of the 
nonlinear Dvoretzky problem in concrete examples. Additional results
of this type are contained in [22].

We have already seen that $c_{\ell_2}(\{-1,1\}^n) \gtrsim \sqrt{n}$. Assume now that we
are given a subset $S \subseteq \{-1,1\}^n$. If we only knew that the cardinality
of $S$ is large, would it then be possible to show that $c_{\ell_2}(S)$ is also
large? It is not clear how to proceed if $|S| = o(2^n)$ (this isn't clear even
when $|S|$ is, say, one tenth of the cube). The random walk technique
turns out to be robust enough to yield almost sharp bounds on the
Euclidean distortion of a large subset of the Hamming cube, without
any a priori assumption on the structure of the subset. Namely, it was
proved in [22] that for every $S \subseteq \{-1,1\}^n$ we have

$$c_{\ell_2}(S) \gtrsim \sqrt{\frac{n}{1 + \log\left(\frac{2^n}{|S|}\right)}}. \qquad (64)$$

Thus, in particular, if $|S| = 2^{n(1-\varepsilon)} = |\{-1,1\}^n|^{1-\varepsilon}$ then (64) becomes

$$c_{\ell_2}(S) \gtrsim \min\left\{\frac{1}{\sqrt{\varepsilon}}, \sqrt{n}\right\}.$$

This bound is tight up to logarithmic factors: it was shown in [22] that
for every $\varepsilon \in (0,1)$ there exists $S \subseteq \{-1,1\}^n$ with $|S| \ge 2^{n(1-\varepsilon)}$ and

$$c_{\ell_2}(S) \lesssim \sqrt{\frac{1 + \log(1/\varepsilon)}{\varepsilon}}.$$

The proof of (64) uses Markov type in a crucial way. Here, in order
to illustrate the main ideas, we will deal with the analogous problem
for subsets of graphs with large girth. Namely, let $G = (V,E)$ be a
finite $k$-regular ($k \ge 3$) connected graph with girth $g$, and write
$n = |V|$. Assume that $S \subseteq V$ is equipped with the metric $d_G$ inherited
from $G$. We will prove the following lower bound on $c_{\ell_2}(S)$, which is
also due to [22]:

$$c_{\ell_2}(S) \gtrsim \sqrt{\frac{g}{1 + \log_k\left(\frac{n}{|S|}\right)}}. \qquad (65)$$



Note that when $S = V$ we return to (58), but the proof of (65)
is more subtle than the proof of (58). This proof uses more heavily
the fact that in (40) we are free to choose the stationary reversible
Markov chain as we wish. Our plan is to construct a special stationary
reversible Markov chain on $S$, which in conjunction with the Markov
type 2 property of Hilbert space, will establish (65).



Ideally, we would like our Markov chain to be something like the
standard random walk on $G$, restricted to $S$. Lemma 9.5 indicates
that for this approach to work we need $S$ to have large average degree,
or equivalently to contain many edges of $G$. But, $S$ might be very small,
and need not contain any edge of $G$. We will overcome this problem
by considering a different set of edges $E'$ on $V$, which is nevertheless
closely related to the geometry of $G$, such that $S$ contains sufficiently
many edges from $E'$. Before proceeding to carry out this plan, we
therefore need to make a small digression which explains a spectral
method for showing that a subset of a graph contains many edges.



9.4.2. $\lambda_n$ and self mixing. Let $H = (\{1,\ldots,n\}, E_H)$ be a $d$-regular
loop-free graph on $\{1,\ldots,n\}$. We denote by $A_H = (a_{ij})$ its adjacency
matrix, i.e., the $n \times n$ matrix whose entries are in $\{0,1\}$, and $a_{ij} = 1$
if and only if $ij \in E_H$. Let $\lambda_1(H) \ge \lambda_2(H) \ge \cdots \ge \lambda_n(H)$ be the
eigenvalues of $A_H$. Thus $\lambda_1(H) = d$, and since the diagonal entries of
$A_H$ vanish, $\mathrm{trace}(A_H) = \sum_{i=1}^n \lambda_i(H) = 0$. In particular we are ensured
that $\lambda_n(H)$ is negative.

Let $\{v_1,\ldots,v_n\}$ be an eigenbasis of $A_H$, which is orthonormal with
respect to the standard scalar product $\langle\cdot,\cdot\rangle$ on $\mathbb{R}^n$. We can choose the
labeling so that $v_1 = \frac{1}{\sqrt{n}}\mathbf{1}_{\{1,\ldots,n\}}$, and the eigenvalue corresponding to
$v_i$ is $\lambda_i(H)$. For every $S \subseteq \{1,\ldots,n\}$ let $E_H(S)$ denote the number of
edges in $E_H$ that are incident to two vertices in $S$. Observe that

$$\langle A_H \mathbf{1}_S, \mathbf{1}_S\rangle = \sum_{i=1}^n \lambda_i(H)\langle v_i, \mathbf{1}_S\rangle^2 = \frac{d|S|^2}{n} + \sum_{i=2}^n \lambda_i(H)\langle v_i, \mathbf{1}_S\rangle^2$$
$$\ge \frac{d|S|^2}{n} + \lambda_n(H)\sum_{i=2}^n \langle v_i, \mathbf{1}_S\rangle^2 = \frac{d|S|^2}{n} + \lambda_n(H)\left(|S| - \frac{|S|^2}{n}\right).$$

Thus, since $\lambda_n(H) < 0$,

$$2E_H(S) = \langle A_H \mathbf{1}_S, \mathbf{1}_S\rangle \ge \frac{d|S|^2}{n} + \lambda_n(H)|S|. \qquad (66)$$
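The self mixing bound (66) can be tested exhaustively on a small regular graph. The sketch below is an illustration under our own choices: the Petersen graph as the test case, hypothetical function names, and a plain power iteration on $dI - A_H$ (started from a seeded random vector) to estimate $\lambda_n(H)$ without external libraries.

```python
import random
from itertools import combinations


def smallest_adjacency_eigenvalue(adj, d, iterations=200, seed=0):
    """Estimate lambda_n of a d-regular graph by power iteration on d*I - A."""
    rng = random.Random(seed)
    x = {v: rng.random() for v in adj}  # generic start vector (assumption)
    mu = 0.0
    for _ in range(iterations):
        y = {v: d * x[v] - sum(x[u] for u in adj[v]) for v in adj}
        norm = sum(val * val for val in y.values()) ** 0.5
        x = {v: val / norm for v, val in y.items()}
        # Rayleigh quotient of d*I - A at the unit vector x.
        mu = sum(x[v] * (d * x[v] - sum(x[u] for u in adj[v])) for v in adj)
    return d - mu


def check_self_mixing(adj, d):
    """Check 2 E_H(S) >= d|S|^2/n + lambda_n |S| for every nonempty subset S."""
    n = len(adj)
    lam_n = smallest_adjacency_eigenvalue(adj, d)
    for size in range(1, n + 1):
        for S in combinations(sorted(adj), size):
            inside = set(S)
            twice_edges = sum(1 for v in S for u in adj[v] if u in inside)  # 2 E_H(S)
            if twice_edges < d * size * size / n + lam_n * size - 1e-6:
                return False
    return True


# Petersen graph: vertices are 2-subsets of {0,...,4}, adjacent when disjoint.
pairs = list(combinations(range(5), 2))
petersen = {p: {q for q in pairs if not set(p) & set(q)} for p in pairs}
```

For the Petersen graph $d = 3$, $n = 10$ and $\lambda_n = -2$, so (66) forces every set of 7 vertices to span at least one edge, consistent with the fact that its largest independent sets have 4 vertices.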



We can use (66) to deduce that $E_H(S)$ is large provided that $\lambda_n(H)$ is
not too negative (in [22] such a bound is called a self mixing inequality).
The bound in (66) is perhaps less familiar than Cheeger's inequality [10,
13], which relates the number of edges joining $S$ and its complement to
$\lambda_2(H)$, but these two inequalities are the same in spirit. We refer
to the survey [73] for more information on the connection between the
second largest eigenvalue and graph expansion. While bounds on $\lambda_2(H)$
would have been very useful for us to have in the ensuing argument to
prove (65) (and the corresponding proof of (64) in [22]), we will only
obtain bounds on $|\lambda_n(H)|$ (for an appropriately chosen graph $H$), which
will nevertheless suffice for our purposes.

9.4.3. The spectral argument in the case of large girth. Returning to
the proof of (65), let $G = (V,E)$ be an $n$-vertex $k$-regular connected
graph ($k \ge 3$) with girth $g$. We assume throughout that $G$ is loop-free
and contains no multiple edges. As before, the shortest path metric
on $G$ is denoted by $d_G$. Fix $m \in \mathbb{N}$ and let $G^{(m)} = (V, E_{G^{(m)}})$ denote
the distance $m$ graph of $G$, i.e., the graph on $V$ in which two vertices
$u,v \in V$ are joined by an edge if and only if $d_G(u,v) = m$.

Recall that $A_{G^{(m)}}$ denotes the adjacency matrix of $G^{(m)}$. Thus we
have $A_{G^{(0)}} = I_V$ (the identity matrix on $V$) and $A_{G^{(1)}} = A_G$. Moreover,
$A_G^2 = kI_V + A_{G^{(2)}}$, and $A_G A_{G^{(m-1)}} = (k-1)A_{G^{(m-2)}} + A_{G^{(m)}}$ for all
$2 < m < \frac{g}{2}$. Indeed, write $(A_G A_{G^{(m-1)}})_{uv} = \sum_{w \in V} (A_G)_{uw} (A_{G^{(m-1)}})_{wv}$
for all $u,v \in V$. There are only two types of possible contributions
to this sum: either $d_G(u,v) = m$ and $w$ is on the unique path joining
$u$ and $v$ such that $uw \in E$, or $d_G(u,v) = m-2$ and $w$ is one of the
neighbors of $u$ which is not on the path joining $u$ and $v$ (the number
of such $w$ equals $k$ if $m = 2$, and equals $k-1$ if $m > 2$).

The above discussion shows that if we define a sequence of polynomials
$\{P_m^k(x)\}_{m=0}^\infty$ by

$$P_0^k(x) = 1, \quad P_1^k(x) = x, \quad P_2^k(x) = x^2 - k, \qquad (67)$$

and recursively, for $m \ge 3$,

$$P_m^k(x) = xP_{m-1}^k(x) - (k-1)P_{m-2}^k(x), \qquad (68)$$

then for all integers $0 \le m < \frac{g}{2}$,

$$A_{G^{(m)}} = P_m^k(A_G). \qquad (69)$$

The polynomials $\{P_m^k(x)\}_{m=0}^\infty$ are known as the Geronimus polynomials
(see [176] and the references therein). By (69), when $m < \frac{g}{2}$ the
eigenvalues of $A_{G^{(m)}}$ are $\{P_m^k(\lambda_i(A_G))\}_{i=1}^n$. For the purpose of bounding
the negative number $\lambda_n(A_{G^{(m)}})$ from below, it therefore suffices to
use the bound

$$\lambda_n(A_{G^{(m)}}) \ge \min_{x \in \mathbb{R}} P_m^k(x). \qquad (70)$$
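The recursion (67), (68) is easy to run by machine. The following Python sketch (the function name `geronimus` is ours) returns the coefficient list of $P_m^k$, lowest degree first, and can be used to confirm small cases by hand.

```python
def geronimus(k, m):
    """Coefficients of P_m^k, lowest degree first, computed from (67) and (68)."""
    polys = [[1], [0, 1], [-k, 0, 1]]  # P_0 = 1, P_1 = x, P_2 = x^2 - k
    for j in range(3, m + 1):
        shifted = [0] + polys[j - 1]   # coefficients of x * P_{j-1}
        prev = polys[j - 2] + [0] * (len(shifted) - len(polys[j - 2]))
        polys.append([a - (k - 1) * b for a, b in zip(shifted, prev)])
    return polys[m]
```

For $k = 3$ and $m = 8$ this yields $x^8 - 15x^6 + 70x^4 - 104x^2 + 24$, the polynomial stated in the caption of Figure 6, and the output also displays the parity and the leading coefficient 1 of $P_m^k$.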



A simple induction shows that $P_m^k(x)$ is a polynomial of degree $m$
with leading coefficient 1, and it is an even function for even $m$, and an
odd function for odd $m$. Moreover, we have the following trigonometric
identity (see [176]):

$$P_m^k\left(2\sqrt{k-1}\cos\theta\right) = (k-1)^{\frac{m}{2}-1}\cdot\frac{(k-1)\sin((m+1)\theta) - \sin((m-1)\theta)}{\sin\theta}. \qquad (71)$$

The proof of (71) is a straightforward induction: check the validity
of (71) for $m = 1, 2$ using (67), and verify by induction that (71) holds
using the recursion (68).

Define $\theta_q = \frac{q + \frac12}{m+1}\pi$. For every $q \in \{0,\ldots,m\}$ we have $\theta_q \in (0,\pi)$
and the sign of $P_m^k(2\sqrt{k-1}\cos\theta_q)$ is equal to the sign of

$$(k-1)\sin((m+1)\theta_q) - \sin((m-1)\theta_q) = (-1)^q(k-1) - \sin\left(\frac{m-1}{m+1}\left(q + \frac12\right)\pi\right).$$

Thus for every $q \in \{0,\ldots,m\}$ the value $P_m^k(2\sqrt{k-1}\cos\theta_q)$ is positive
if $q$ is even, and negative if $q$ is odd. It follows that $P_m^k$ must have a zero
in each of the $m$ intervals $\{[2\sqrt{k-1}\cos\theta_{q+1}, 2\sqrt{k-1}\cos\theta_q]\}_{q=0}^{m-1}$.
Since $P_m^k$ is a polynomial of degree $m$, we deduce that the zeros of $P_m^k$
are contained in the interval $[-2\sqrt{k-1}, 2\sqrt{k-1}]$. In particular, if $m$
is even, since $P_m^k(x)$ is an even function which tends to $\infty$ as $x \to \infty$,
it can take negative values only in the interval $[-2\sqrt{k-1}, 2\sqrt{k-1}]$.
See Figure 6 for the case $k = 3$, $m = 8$.
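Both the identity (71) and the alternating sign pattern at the points $\theta_q$ can be checked numerically. The sketch below is only a floating point sanity check with hypothetical function names; it evaluates $P_m^k$ through the recursion (67), (68) and compares with the trigonometric form.

```python
import math


def geronimus_value(k, m, x):
    """Evaluate P_m^k(x) through the recursion (67)-(68)."""
    if m == 0:
        return 1.0
    if m == 1:
        return x
    prev, cur = x, x * x - k  # P_1 and P_2
    for _ in range(3, m + 1):
        prev, cur = cur, x * cur - (k - 1) * prev
    return cur


def trig_form(k, m, theta):
    """Right-hand side of (71)."""
    return ((k - 1) ** (m / 2 - 1)
            * ((k - 1) * math.sin((m + 1) * theta) - math.sin((m - 1) * theta))
            / math.sin(theta))


def signs_at_theta_q(k, m):
    """Signs of P_m^k(2 sqrt(k-1) cos(theta_q)) for q = 0, ..., m."""
    signs = []
    for q in range(m + 1):
        theta_q = (q + 0.5) * math.pi / (m + 1)
        value = geronimus_value(k, m, 2 * math.sqrt(k - 1) * math.cos(theta_q))
        signs.append(1 if value > 0 else -1)
    return signs
```

The $m + 1$ alternating signs produce $m$ sign changes, hence $m$ zeros inside $[-2\sqrt{k-1}, 2\sqrt{k-1}]$, which accounts for all the zeros of the degree $m$ polynomial $P_m^k$.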



It follows from the above discussion, combined with (70) and (71),
that for every even integer $0 < m < \frac{g}{2}$ we have

$$\lambda_n(A_{G^{(m)}}) \ge (k-1)^{\frac{m}{2}-1}\min_{\theta \in [-\pi,\pi]}\frac{(k-1)\sin((m+1)\theta) - \sin((m-1)\theta)}{\sin\theta}$$
$$= (k-1)^{\frac{m}{2}-1}\min_{\theta \in [-\pi,\pi]}\left((k-1)e^{-im\theta}\sum_{j=0}^{m} e^{2ij\theta} - e^{-i(m-2)\theta}\sum_{j=0}^{m-2} e^{2ij\theta}\right)$$
$$\ge -(k-1)^{\frac{m}{2}-1}\big((k-1)(m+1) + m - 1\big)$$
$$\ge -(k-1)^{\frac{m}{2}-1}k(m+1). \qquad (72)$$

Figure 6. A plot of the polynomial
$P_8^3(x) = x^8 - 15x^6 + 70x^4 - 104x^2 + 24$.

Since the degree of $G^{(m)}$ is $k(k-1)^{m-1}$, the following corollary is a
combination of (66) and (72).

Corollary 9.6. Let $G = (V,E)$ be a $k$-regular graph with girth $g$. Then
for all even integers $0 < m < \frac{g}{2}$ and for all $S \subseteq V$, the average degree
in the graph induced by $G^{(m)}$ on $S$ satisfies

$$\frac{2E_{G^{(m)}}(S)}{|S|} \ge \frac{|S|}{n}k(k-1)^{m-1} - (k-1)^{\frac{m}{2}-1}k(m+1).$$

In particular, if

$$\frac{n}{|S|} \le \frac{(k-1)^{\frac{m}{2}}}{2(m+1)} \qquad (73)$$

then

$$\frac{2E_{G^{(m)}}(S)}{|S|} \ge \frac{|S|}{2n}k(k-1)^{m-1}. \qquad (74)$$



9.4.4. Completion of the proof of (65). Corollary 9.6, in combination
with Lemma 9.5, suggests that we should consider the stationary
reversible random walk on the graph induced by $G^{(m)}$ on $S$. We will
indeed do so, and by judiciously choosing $m$, (65) will follow.

For each $v \in S$ we denote by $\deg_{G^{(m)}[S]}(v)$ its degree in the graph
induced by $G^{(m)}$ on $S$, i.e., the number of vertices $u \in S$ that are at
distance $m$ from $v$, where the distance is measured according to the
original shortest path metric on $G$. As in (63), for $v \in S$ we write

$$\pi(v) = \frac{\deg_{G^{(m)}[S]}(v)}{2E_{G^{(m)}}(S)}. \qquad (75)$$

Let $\{Z_t\}_{t=0}^\infty$ be the following Markov chain on $S$: $Z_0$ is distributed
according to $\pi$, and $Z_{t+1}$ is distributed uniformly on the $\deg_{G^{(m)}[S]}(Z_t)$
vertices of $S$ at distance $m$ from $Z_t$ (note that $\deg_{G^{(m)}[S]}(Z_t) > 0$, since
$Z_t$ is distributed only on those $v \in S$ for which $\pi(v) > 0$).

At time $t \in \mathbb{N}$ we clearly have $d_G(Z_0,Z_t) \le tm$. In order to remain
in the local "tree range", we will therefore impose the assumption

$$tm < \frac{g}{2}. \qquad (76)$$

Assume from now on that $m$ is divisible by 6. We first observe that for
$t$ as in (76), the number of neighbors $w \in V$ of $Z_{t-1}$ in the graph $G^{(m)}$
which satisfy $d_G(w,Z_0) < d_G(Z_0,Z_{t-1}) + \frac{m}{3}$ is at most $(k-1)^{\frac{2m}{3}-1}$.
Indeed, we may assume that $d_G(Z_0,Z_{t-1}) > \frac{m}{3}$, since otherwise for any
such $w$ we have

$$d_G(w,Z_0) \ge d_G(w,Z_{t-1}) - d_G(Z_0,Z_{t-1}) = m - d_G(Z_0,Z_{t-1}) \ge d_G(Z_0,Z_{t-1}) + \frac{m}{3}.$$

So, assuming $d_G(Z_0,Z_{t-1}) > \frac{m}{3}$ and $d_G(w,Z_0) < d_G(Z_0,Z_{t-1}) + \frac{m}{3}$,
let $v$ be the point on the unique path joining $Z_0$ and $Z_{t-1}$ such that
$d_G(v,Z_{t-1}) = \frac{m}{3} + 1$. The path in $G$ (whose length is $m$) joining $w$
and $Z_{t-1}$ must pass through $v$. See Figure 7 for an explanation of this
simple fact. Note that $d_G(w,v) = \frac{2m}{3} - 1$, and hence the number of
such $w$ is at most $(k-1)^{\frac{2m}{3}-1}$.

Figure 7. If the path joining $w$ and $Z_{t-1}$ does not
pass through $v$ then it must touch the path joining
$Z_0$ and $Z_{t-1}$ at a vertex $u$, as depicted above. Denoting
$i = d_G(u,Z_{t-1})$, we have $i \le \frac{m}{3}$. Hence, since
$d_G(w,Z_0) = (d_G(Z_0,Z_{t-1}) - i) + (m - i)$, we have
$d_G(w,Z_0) \ge d_G(Z_0,Z_{t-1}) + \frac{m}{3}$.



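Let $N(Z_{t-1})$ denote the number of $w \in S$ with $d_G(w,Z_{t-1}) = m$ and
$d_G(w,Z_0) < d_G(Z_0,Z_{t-1}) + \frac{m}{3}$. Then,

$$\mathbb{E}[d_G(Z_0,Z_t)] \ge \mathbb{E}\left[\frac{\deg_{G^{(m)}[S]}(Z_{t-1}) - N(Z_{t-1})}{\deg_{G^{(m)}[S]}(Z_{t-1})}\left(d_G(Z_0,Z_{t-1}) + \frac{m}{3}\right) + \frac{N(Z_{t-1})}{\deg_{G^{(m)}[S]}(Z_{t-1})}\big(d_G(Z_0,Z_{t-1}) - m\big)\right]$$
$$= \mathbb{E}[d_G(Z_0,Z_{t-1})] + \frac{m}{3} - \frac{4m}{3}\,\mathbb{E}\left[\frac{N(Z_{t-1})}{\deg_{G^{(m)}[S]}(Z_{t-1})}\right]. \qquad (77)$$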



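We will estimate the last term appearing in (77) via the point-wise
bound $N(Z_{t-1}) \le (k-1)^{\frac{2m}{3}-1}$ that we proved above, together with (74),
for which we need to assume (73).

$$\mathbb{E}\left[\frac{N(Z_{t-1})}{\deg_{G^{(m)}[S]}(Z_{t-1})}\right] \le (k-1)^{\frac{2m}{3}-1}\sum_{\substack{v \in S \\ \deg_{G^{(m)}[S]}(v) > 0}}\frac{\pi(v)}{\deg_{G^{(m)}[S]}(v)} \le (k-1)^{\frac{2m}{3}-1}\frac{|S|}{2E_{G^{(m)}}(S)} \le \frac{2n}{k(k-1)^{\frac{m}{3}}|S|}. \qquad (78)$$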



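By combining (77) and (78) we get the bound

$$\mathbb{E}[d_G(Z_0,Z_t)] \ge \mathbb{E}[d_G(Z_0,Z_{t-1})] + \frac{m}{3} - \frac{8mn}{3k(k-1)^{\frac{m}{3}}|S|} \ge \mathbb{E}[d_G(Z_0,Z_{t-1})] + \frac{m}{6}, \qquad (79)$$

provided that

$$\frac{n}{|S|} \le \frac{k(k-1)^{\frac{m}{3}}}{16}. \qquad (80)$$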



We can ensure that our restrictions on $m$, namely (73) and (80), are
satisfied for some $m \asymp 1 + \log_k(n/|S|)$ that is divisible by 6. For such a
value of $m$, we know that (79) is valid as long as $t$ satisfies (76). Thus,
by iterating (79) we see that for some $t \asymp g/m$ we have

$$\mathbb{E}\big[d_G(Z_0,Z_t)^2\big] \ge \big(\mathbb{E}[d_G(Z_0,Z_t)]\big)^2 \gtrsim (tm)^2 \gtrsim g^2. \qquad (81)$$

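If $f : S \to \ell_2$ satisfies

$$d_G(x,y) \le \|f(x) - f(y)\|_2 \le D\,d_G(x,y) \qquad (82)$$

for all $x,y \in S$, then it follows from the Markov type 2 property of
Hilbert space that

$$g^2 \stackrel{(81)}{\lesssim} \mathbb{E}\big[d_G(Z_0,Z_t)^2\big] \stackrel{(82)\wedge(40)}{\lesssim} t\,\mathbb{E}\big[\|f(Z_1) - f(Z_0)\|_2^2\big] \stackrel{(82)}{\le} tD^2\,\mathbb{E}\big[d_G(Z_0,Z_1)^2\big] = D^2tm^2 \asymp D^2 g\left(1 + \log_k\left(\frac{n}{|S|}\right)\right),$$

where the last two steps use $d_G(Z_0,Z_1) = m$ and $tm \asymp g$.
This completes the proof of (65). □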



9.4.5. Discrete groups. Let $G$ be an infinite group which is generated by
a finite symmetric subset $S \subseteq G$. Let $d_S$ denote the left invariant
word metric induced by $S$ on $G$, i.e., $d_S(x,y)$ is the smallest integer
$k \ge 0$ such that there exist $s_1,\ldots,s_k \in S$ with $x^{-1}y = s_1s_2\cdots s_k$. It
has long been established that it is fruitful to study finitely generated
groups as geometric objects, i.e., as metric spaces when equipped with a
word metric (see EHl HSl 112] and the references therein for an indication
of the large amount of literature on this topic). Here we will describe
the role of Markov type in this context.

Assume that the metric space $(G,d_S)$ does not admit a bi-Lipschitz
embedding into Hilbert space, i.e., $c_{\ell_2}(G,d_S) = \infty$. Based on the
experience of researchers thus far, this assumption is not restrictive: it
is conjectured in [15] that if $(G,d_S)$ does admit a bi-Lipschitz
embedding into Hilbert space then $G$ has an Abelian subgroup of finite
index. Fix a mapping $f : G \to \ell_2$. Note that if $f$ is not a Lipschitz
function then the mapping $x \mapsto \max_{s \in S}\|f(xs) - f(x)\|_2$ must
be unbounded on $G$. If we consider only mappings $f$ which have
bounded displacement on edges of the Cayley graph induced by $S$
on $G$, then the fact that $c_{\ell_2}(G,d_S) = \infty$ must mean that if we set
$\omega_f(t) = \inf_{d_S(x,y) \ge t}\|f(x) - f(y)\|_2$ then $\omega_f(t) = o(t)$ as $t \to \infty$. To
see this consider the mapping $\psi : G \to \ell_2 \oplus \ell_2(G) \cong \ell_2$ given by
$\psi(x) = f(x) \oplus \delta_x$. The fact that $\psi$ has infinite distortion implies that $f$
must asymptotically compress arbitrarily large distances in $G$.

The modulus $\omega_f(t)$ is called the compression function of $f$. If we
manage to show that for any $f : G \to \ell_2$ the rate at which $\omega_f(t)/t$
tends to zero must be "fast", then we might deduce valuable structural
information on the group $G$. This general approach (including the
terminology that we are using) is due to Gromov (see Section 7.E in [68]).
Here we will study a further refinement of this idea, which will yield a
numerical invariant of infinite groups called the compression exponent.
This elegant definition is due to Guentner and Kaminker [72], and it
was extensively studied in recent years (see the introduction to [148] for
background and references). We will focus here on the use of random
walk techniques in the task of computing (or estimating) this invariant.

The Guentner-Kaminker definition is simple to state. Given a metric 
space {Y, dy), the F-compression exponent of G, denoted ay(G), is the 
supremum of those a ^ for which there exists a Lipschitz function 
f : G ^ Y which satisfies dyifix), f{y)) ^ ds{x,y)°' for all x,y E X. 
We remark that in the notation ay (G) we dropped the explicit reference 
to the generating set S. This is legitimate since, if we switch to a 
different finite symmetric generating set S" C G, then the resulting 
word metric ds' is bi-Lipschitz equivalent to the original word metric 
ds, and therefore the F-compression exponents of S and S' coincide. 
In other words, the number ay(G) G [0, 1] is a true algebraic invariant 
of the group G, which does not depend on the particular choice of 
a finite symmetric set of generators. The parameter al^{G) is called 
the Hilbert compression exponent of G. It was shown in [8] that any 
$\alpha \in [0, 1]$ is the Hilbert compression exponent of some finitely
generated group $G$ (see [HI 1155] for the related question for amenable
groups). Nevertheless, there are relatively few concrete examples of
groups $G$ for which $\alpha_{\ell_2}(G)$ (and $\alpha_{\ell_p}(G)$) has
been computed. We will demonstrate how Markov type is relevant to the
problem of estimating $\alpha_Y(G)$. This approach was introduced in
[13], and further refined in [147, 148].

We will examine the applicability of random walks to the compu- 
tation of compression exponents of discrete groups via an illustrative 
example: the wreath product of the group of integers Z with itself. 
Before doing so, we recall for the sake of completeness the definition 
of the wreath product of two general groups G, H. Readers who are 
not accustomed to this concept are encouraged to focus on the case 
$G = H = \mathbb{Z}$, as it contains the essential ideas that we wish to convey.

Let $G, H$ be groups which are generated by the finite symmetric sets
$S_G \subseteq G$, $S_H \subseteq H$. We denote by $e_G, e_H$ the
identity elements of $G, H$, respectively. We also denote by $e_{G^H}$
the function from $H$ to $G$ which takes the value $e_G$ at all points
$x \in H$. The (restricted) wreath product of $G$ with $H$, denoted
$G \wr H$, is defined as the group of all pairs $(f, x)$ where
$f : H \to G$ has finite support (i.e., $f(z) = e_{G^H}(z) = e_G$ for
all but finitely many $z \in H$) and $x \in H$, equipped with the product

$$(f, x)(g, y) = \left(z \mapsto f(z)\, g(x^{-1} z),\ xy\right).$$

$G \wr H$ is generated by the set
$\{(e_{G^H}, x) : x \in S_H\} \cup \{(\delta_y, e_H) : y \in S_G\}$,
where $\delta_y : H \to G$ is the function which takes the value $y$ at
$e_H$ and the value $e_G$ on $H \setminus \{e_H\}$.
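
As an illustration of the product formula in the case
$G = H = \mathbb{Z}$ (written additively, so that $x^{-1}z$ becomes
$z - x$), here is a minimal Python sketch. The representation of a
finitely supported function as a dictionary, and the names `Element`,
`multiply`, `inverse`, `MOVE_RIGHT`, `LAMP_UP`, etc., are assumptions
made for this example and do not come from the text.

```python
from typing import Dict, Tuple

# An element of Z wr Z is a pair (f, x): f is a finitely supported
# function Z -> Z, stored as a dict with no zero values, and x is an integer.
# (This encoding is an assumption made for illustration.)
Element = Tuple[Dict[int, int], int]

IDENTITY: Element = ({}, 0)


def multiply(a: Element, b: Element) -> Element:
    """(f, x)(g, y) = (z -> f(z) + g(z - x), x + y), in additive notation."""
    (f, x), (g, y) = a, b
    h = dict(f)
    for z, value in g.items():
        # g is shifted by x before being added to f.
        h[z + x] = h.get(z + x, 0) + value
    # Drop zero entries so that equal elements have equal representations.
    h = {z: v for z, v in h.items() if v != 0}
    return (h, x + y)


def inverse(a: Element) -> Element:
    """Inverse of (f, x), namely (z -> -f(z + x), -x)."""
    f, x = a
    return ({z - x: -v for z, v in f.items()}, -x)


# The generators from the text when S_G = S_H = {+1, -1}.
MOVE_RIGHT: Element = ({}, 1)
MOVE_LEFT: Element = ({}, -1)
LAMP_UP: Element = ({0: 1}, 0)
LAMP_DOWN: Element = ({0: -1}, 0)
```

Multiplying on the right by `MOVE_RIGHT` or `MOVE_LEFT` moves the second
coordinate by one, while multiplying on the right by `LAMP_UP` or
`LAMP_DOWN` changes the value of the function at the current position by
one.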

When $G = C_2 = \{0, 1\}$, the cyclic group of order 2, the group
$C_2 \wr H$ is often called the lamplighter group on $H$. In this case
imagine that at every site $x \in H$ there is a lamp, which can either
be on or off. An element $(f, x) \in C_2 \wr H$ can be thought of as
indicating that a "lamplighter" is located at $x \in H$, and $f$
represents the locations of those (finitely many) lamps which are on
(these locations are the sites $y \in H$ where $f(y) = 1$). The distance
in $C_2 \wr H$ between $(f, x)$ and $(g, y)$ is the minimum number of
steps required for the lamplighter to start at $x$, visit all the sites
$z \in H$ for which $f(z) \ne g(z)$, change $f(z)$ to $g(z)$, and end up
at the site $y$. Here, by a "step" we mean a move from $x$ to $xs$ for
some $s \in S_H$, or a change of the state of the lamp (from on to off
or vice versa) at the current location of the lamplighter. Thus, the
distance between $(f, x)$ and $(g, y)$ is, up to a factor of 2, the
length of the shortest (in the metric $d_{S_H}$) traveling salesman tour
starting at $x$, covering the symmetric difference of the supports of
$f$ and $g$, and terminating at $y$.
For a general group $G$, the description of the metric on $G \wr H$ is
similar, the only difference being that the state of each lamp is an
element of $G$ (not just on or off), and the cost of changing the state
of a lamp from $a \in G$ to $b \in G$ is $d_{S_G}(a, b)$. See Figure 8
for a schematic description of the case $G = H = \mathbb{Z}$.

Figure 8. An example of two elements in $\mathbb{Z} \wr \mathbb{Z}$. The
numbers above and below the $\mathbb{Z}$-axis represent two
$\mathbb{Z}$-valued finitely supported functions. The arrows indicate
the location of the lamplighter in each of the corresponding elements of
$\mathbb{Z} \wr \mathbb{Z}$. In order to compute the distance between
these elements, the lamplighter of the top configuration must visit the
locations where the top values differ from the bottom values, and at
these locations the top value must be changed to the bottom value. At
the end of the process, the lamplighter must end up at the location
indicated by the bottom arrow. Each movement of the lamplighter to a
neighboring integer adds a unit cost to this process, and an increment
or decrement of 1 to the value at the location of the lamplighter also
incurs a unit cost. The distance in $\mathbb{Z} \wr \mathbb{Z}$ is the
minimum cost of such a process which transforms the top configuration to
the bottom configuration. Thus, in the above example, the top
lamplighter will first move one step to the right, incurring a unit
cost, change the 7 in the top row to 0, incurring a cost of 7 units,
take one step to the left (incurring a unit cost), change the 1 in the
top row to a 7 (incurring a cost of 6 units), and so on.
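
The cost described in the caption can be computed directly when
$G = H = \mathbb{Z}$ with generators $\pm 1$. The following Python
sketch is an illustration under that assumption only. It uses the fact
that on the line, a shortest tour from $x$ to $y$ visiting a finite set
of sites must reach the leftmost and the rightmost of those sites, in
one of the two possible orders. The function name `distance` and the
dictionary encoding of the lamp configuration are hypothetical choices
made for this example.

```python
from typing import Dict, Tuple

# (f, x): f is a finitely supported function Z -> Z stored as a dict
# with no zero values, x is the position of the lamplighter.
# (This encoding is an assumption made for illustration.)
Element = Tuple[Dict[int, int], int]


def distance(a: Element, b: Element) -> int:
    """Minimum cost of transforming configuration a into configuration b."""
    (f, x), (g, y) = a, b
    # Sites where the two lamp configurations differ.
    sites = [z for z in set(f) | set(g) if f.get(z, 0) != g.get(z, 0)]
    if not sites:
        # Nothing to change: the lamplighter only walks from x to y.
        return abs(x - y)
    # Unit cost per increment or decrement of a lamp value.
    lamp_cost = sum(abs(f.get(z, 0) - g.get(z, 0)) for z in sites)
    lo, hi = min(sites), max(sites)
    # The tour must reach both lo and hi, in one of two orders.
    left_first = abs(x - lo) + (hi - lo) + abs(hi - y)
    right_first = abs(x - hi) + (hi - lo) + abs(lo - y)
    return lamp_cost + min(left_first, right_first)
```

For instance, `distance(({}, 0), ({1: 7}, 0))` accounts for one step to
the right, seven increments, and one step back to the left.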



We shall now describe an argument using random walks, showing that
$\alpha_{\ell_2}(\mathbb{Z} \wr \mathbb{Z}) \le \frac{2}{3}$. This
approach is due to [13]. In fact, as shown in [147],
$\alpha_{\ell_2}(\mathbb{Z} \wr \mathbb{Z}) \ge \frac{2}{3}$, and
therefore the argument below is sharp, and yields the exact computation
$\alpha_{\ell_2}(\mathbb{Z} \wr \mathbb{Z}) = \frac{2}{3}$. More
generally, it is shown in [148] that for every $p \in [1, 2]$ we have

$$\alpha_{\ell_p}(\mathbb{Z} \wr \mathbb{Z}) = \frac{p}{2p - 1}. \qquad (83)$$



The proof of (83) when $p \ne 2$ requires an additional idea that we
will not work out in detail here: instead of examining the standard
random walk on $\mathbb{Z} \wr \mathbb{Z}$ one studies a discrete
version of a $q$-stable random walk for every $q \in [p, 2]$. This
yields a new twist of the Markov type method: it is beneficial to adapt
the random walk to the geometry of the target space, and to use random
walks with unbounded increments (though we have already seen the latter
occur in Section 9.4.4). We refer to [147, 148] for more general results
that go beyond the case of $\mathbb{Z} \wr \mathbb{Z}$, as well as an
explanation of the background, history, and applications of these types
of problems. It suffices to say here that we chose to focus on the group
$\mathbb{Z} \wr \mathbb{Z}$ since, before the introduction of random
walk techniques, it was the simplest concrete group which resisted the
attempts to compute its $\ell_p$ compression exponents.

Consider the standard random walk $\{W_t\}_{t=0}^\infty$ on
$\mathbb{Z} \wr \mathbb{Z}$, starting at the identity element. Namely,
we start at $e_{\mathbb{Z} \wr \mathbb{Z}}$, i.e., the configuration
corresponding to all the lamps being turned off, and the lamplighter
being at 0. At each step a fair coin is tossed, and depending on the
outcome of the coin toss, either the lamplighter moves to one of its
two neighboring locations uniformly at random, or the value at the
current location of the lamplighter is changed by $+1$ or $-1$
uniformly at random.

After $t$ steps, we expect that a constant fraction of the coin tosses
resulted in a movement of the lamplighter, which is just a standard
random walk on the integers $\mathbb{Z}$. Thus, at time $t$ we expect
the lamplighter to be located at $\pm\sqrt{t}$, up to constant factors.
One might also expect that during the walk the lamplighter spent roughly
(up to constant factors) the same amount of total time at a definite
fraction of the sites between 0 and its location at time $t$. There are
$\asymp \sqrt{t}$ such sites, and therefore, if this intuition is indeed
correct, we expect the time spent at each of these sites to be
$\asymp t/\sqrt{t} = \sqrt{t}$. At each such site the value of the lamp
is also the result of a random walk on $\mathbb{Z}$, and therefore at
time $t$ we expect $W_t$ to have $\asymp \sqrt{t}$ sites at which the
value of the lamp is $\pm\sqrt{\sqrt{t}} = \pm\sqrt[4]{t}$, up to
constant factors. This heuristic argument suggests that

$$\mathbb{E}\left[d_{\mathbb{Z} \wr \mathbb{Z}}\left(W_t, e_{\mathbb{Z} \wr \mathbb{Z}}\right)\right] \gtrsim \sqrt{t} \cdot \sqrt[4]{t} = t^{3/4}. \qquad (84)$$
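
The heuristic behind (84) can be explored numerically. The following
Python sketch simulates the walk described above and averages the
distance of $W_t$ from the identity over independent runs. It is only an
experiment under stated assumptions, not a proof: the function names
`word_length`, `walk` and `mean_distance`, the sample sizes, and the
seed are all choices made for this example, and the distance formula is
the one for $\mathbb{Z} \wr \mathbb{Z}$ with generators $\pm 1$ in each
factor.

```python
import random
from typing import Dict, Tuple


def word_length(f: Dict[int, int], x: int) -> int:
    """Distance in Z wr Z from the identity to (f, x).

    f maps sites to nonzero lamp values; x is the lamplighter position.
    """
    if not f:
        return abs(x)
    lo, hi = min(f), max(f)
    # Shortest walk from 0 to x that reaches both lo and hi.
    tour = min(abs(lo) + (hi - lo) + abs(hi - x),
               abs(hi) + (hi - lo) + abs(lo - x))
    return sum(abs(v) for v in f.values()) + tour


def walk(t: int, rng: random.Random) -> Tuple[Dict[int, int], int]:
    """Run t steps of the walk started at the identity; return W_t."""
    f: Dict[int, int] = {}
    x = 0
    for _ in range(t):
        step = rng.choice((1, -1))
        if rng.random() < 0.5:
            x += step  # the lamplighter moves to a neighboring site
        else:
            f[x] = f.get(x, 0) + step  # the current lamp changes by +-1
            if f[x] == 0:
                del f[x]
    return f, x


def mean_distance(t: int, samples: int, seed: int = 0) -> float:
    """Empirical average of d(W_t, identity) over independent runs."""
    rng = random.Random(seed)
    return sum(word_length(*walk(t, rng)) for _ in range(samples)) / samples


if __name__ == "__main__":
    for t in (100, 400, 1600, 6400):
        estimate = mean_distance(t, samples=200)
        # If (84) describes the true growth rate, this ratio should
        # stay bounded away from zero as t grows.
        print(t, estimate, estimate / t ** 0.75)
```

The printed ratio compares the empirical mean with $t^{3/4}$; a
simulation of this kind can only suggest the growth rate, and the
rigorous statement is (84).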
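The heuristic considerations behind (84) can indeed be made to yield a
rigorous proof of (84); see [54] and also [168], as well as Section 6 in
[147] for an extension to the case of general wreath products.

The fact that $\ell_2$ has Markov type 2 suggests that if
$f : \mathbb{Z} \wr \mathbb{Z} \to \ell_2$ satisfies
$d_{\mathbb{Z} \wr \mathbb{Z}}(x, y)^\alpha \lesssim \|f(x) - f(y)\|_2 \lesssim d_{\mathbb{Z} \wr \mathbb{Z}}(x, y)$,
then

$$\mathbb{E}\left[\|f(W_t) - f(W_0)\|_2^2\right] \lesssim t,$$

yet due to (84)

$$\mathbb{E}\left[\|f(W_t) - f(W_0)\|_2^2\right] \gtrsim t^{2\alpha \cdot \frac{3}{4}}.$$

Comparing the exponents of $t$ gives $\frac{3\alpha}{2} \le 1$, i.e.,
$\alpha \le \frac{2}{3}$, as required. But this argument is flawed: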
we are only allowed to use the Markov type 2 inequality (40) for
stationary reversible Markov chains. The Markov chain
$\{W_t\}_{t=0}^\infty$ starts at the deterministic point
$e_{\mathbb{Z} \wr \mathbb{Z}}$ rather than at a point chosen uniformly
at random over $\mathbb{Z} \wr \mathbb{Z}$. Of course, since
$\mathbb{Z} \wr \mathbb{Z}$ is an infinite set, there is no way to make
$W_0$ be uniformly distributed over it. The above argument can be
salvaged by either considering instead an appropriately truncated random
walk starting at a uniformly chosen point from a large enough Følner set
of $\mathbb{Z} \wr \mathbb{Z}$, or by applying an argument of Aharoni,
Maurey and Mityagin p] and Gromov [45] (see also [148]) to reduce the
problem to equivariant embeddings, and then to prove that the Markov
type inequality does hold true for images of the random walk
$\{W_t\}_{t=0}^\infty$ (starting at $e_{\mathbb{Z} \wr \mathbb{Z}}$)
under equivariant mappings. See [13] for the former approach and [147]
for the latter approach.